ABSTRACT


Effectively allocating Navy recruiting resources to recruiting stations requires an
understanding of the market factors that influence a station’s expected production. Currently,
Navy Recruiting Command (NRC), N5 (Research) uses an index constructed from 18 factors
describing the station’s operating environment to identify a station’s recruiting potential.
This research develops alternative models to identify a station’s potential over both
three-year and monthly periods and compares them to models using only the index value.
Regression techniques provide models with more explanatory power than models using only the
constructed index, while using only a subset of the 18 factors in the current index. These
models will be used by NRC N5 to inform market analysis and by NRC N7 (Training)
within their training game for Navy Recruiting District (NRD) leaders.
Executive Summary


Military recruiting is an arduous task that requires manpower and financial resources. Navy
recruit district leaders are in charge of a number of recruiting stations that can span
several states. These leaders have to manage all personnel, stations, and budgets within their
districts. One of the difficulties for these leaders is the short amount of time they have to
understand the recruiting environment prior to making resource allocation decisions that
affect their mission. In order to address this training shortfall, an interactive training game
was developed which allows recruiters to practice allocating resources in order to
maximize the number of recruits they obtain. The current version of the game does not have an
empirically based predictive model that generates accession totals for each station.

The purpose of this research is to develop a statistical model that predicts a Navy Recruiting 
Station’s (NRS’s) expected number of enlisted accessions and to identify the most relevant 
predictors of a station’s recruiting potential. The model will be utilized within the
training game for chief recruiters being developed by NRC N7 (Training) and to assist market
analysis conducted by NRC N5 (Research). This will benefit NRC in two ways: first, NRD
leaders will have a better understanding of how the decisions they make affect their
mission, and second, the model will help ensure leaders are being trained to attend to the
most important indicators associated with successful recruiting.

This study utilized multiple data sets provided by NRC N5 in order to develop models to 
predict short-term and long-term accessions. Selected variables from each data set were
consolidated to form two data sets: one used to develop a three-year model and another
used to develop a monthly model.

Numerous models were created for short-term and long-term accession predictions. This
study found that the best model to utilize for long-term predictions is the three-year multiple
linear regression model. The response variable is the total number of three-year
accessions by station. The predictor variables are the average number of Navy recruiters at
each NRS over the past 12 months, the competition index, the four-year average number
of national leads divided by the population of an area, the distance between an NRS and
the Military Entrance Processing Station (MEPS), and the percentage of individuals with
a minimum level of education of a high school diploma. It has the highest explanatory
capability and lowest residual standard error of any three-year prediction model. The best
short-term prediction model is the Poisson monthly model. The response variable is the
total number of monthly accessions by NRS. The predictor variables are all
factors referenced before in the three-year model, plus the month the observation took place,
the total number of 17- to 24-year-olds per square mile within an NRS’s boundaries, and the
maximum distance 75% of recruits have to travel to get to the NRS. It has one of the highest
explanatory capabilities and is also the only model that meets all its assumptions. Both of
these models performed better than short-term and long-term univariate models based on
the Noble Index Value (NIV). Additionally, this study found that the most important factors
for short-term and long-term accessions are:

1. The average number of recruiters over the last 12 months at each NRS. This is the most
important variable in all models. This may relate to work capacity.

2. The month in which the person accessed. February to September have more
predicted accessions than January. October to December have fewer predicted
accessions than January (only included in short-term predictions).

3. The competition index for each NRS. A higher index value correlates with fewer 
accessions. This may indicate areas with greater competition with other services are 
harder to recruit from. 

4. The four-year average number of national leads divided by the population of an area.
A higher number of leads results in more accessions. This may correlate with the level
of interest an area has with regard to joining the Navy.

5. The distance from an NRS to the MEPS. A greater distance results in fewer accessions.
More time spent driving may decrease time spent recruiting.

6. The percentage of individuals within an area with a minimum education level of a 
high school diploma. A higher percentage results in more accessions. This may 
relate to less time spent finding qualified applicants. 

This study recommends the Poisson monthly model be utilized within the training game. 
The Poisson model has one of the best explanatory capabilities of any monthly model 
explored, accounts for the small number of occurrences within the response variable, and 
produces only non-negative integers. Additionally, this model is the only model out of all
five that meets all of its assumptions. This means that it can be utilized for inference via
confidence intervals and hypothesis testing, as well as for point estimation.
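
As a sketch of why a Poisson model suits count data, the standard textbook form of a Poisson regression is shown below. The notation is assumed for illustration only and is not the fitted model from this study: Y_{im} is the number of accessions at station i in month m, x_{imj} is the value of predictor j, and the beta terms are regression coefficients.

```latex
Y_{im} \sim \mathrm{Poisson}(\mu_{im}), \qquad
\log \mu_{im} = \beta_0 + \sum_{j=1}^{p} \beta_j x_{imj}
\quad\Longrightarrow\quad
\mu_{im} = \exp\!\Big(\beta_0 + \sum_{j=1}^{p} \beta_j x_{imj}\Big) > 0
```

Because the mean is the exponential of the linear predictor, it is always positive, and a one-unit increase in predictor j multiplies the expected count by e^{\beta_j}.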

Future work should explore the development of a zero-inflated Poisson monthly model to
account for the large number of zeroes in the response variable. Additional factors could
also be explored in order to increase the models’ explanatory capability; these include
NRS proximity to military installations, median income, college attendance
rates, and unemployment rates.
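
To illustrate what "zero-inflated" means, the following minimal Python sketch compares the probability of a zero count under an ordinary Poisson distribution and under a zero-inflated Poisson distribution. The function names and the parameter values (mu, pi_zero) are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math


def poisson_pmf(y: int, mu: float) -> float:
    """Probability of observing y events under Poisson(mu)."""
    return math.exp(-mu) * mu ** y / math.factorial(y)


def zip_pmf(y: int, mu: float, pi_zero: float) -> float:
    """Zero-inflated Poisson: with probability pi_zero the count is a
    structural zero; otherwise it is drawn from Poisson(mu)."""
    base = (1.0 - pi_zero) * poisson_pmf(y, mu)
    return pi_zero + base if y == 0 else base


# Hypothetical values chosen only for illustration.
mu, pi_zero = 1.5, 0.2
print(round(poisson_pmf(0, mu), 4), round(zip_pmf(0, mu, pi_zero), 4))
```

The zero-inflated form places extra probability on zero while leaving the relative shape of the positive counts unchanged.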
CHAPTER 1:
Introduction 


1.1 Introduction 

Navy Recruiting Command is the organization in charge of recruiting thousands of
civilians to fill billets needed within the Navy. It is responsible for determining the number of
accessions, also referred to as recruits, required to keep the Navy supplied with enough
personnel to complete its mission, while minimizing the cost associated with obtaining
those individuals. Navy Recruiting Command (NRC) is broken down into two recruiting
regions, Region West and Region East, which are then broken down into a total of 26 Navy
Recruiting Districts (NRDs).


Figure 1.1: U.S. Map of Navy Recruiting Districts, from [1]

These districts are further broken down into the Navy Recruiting Stations (NRSs). Station
numbers vary from year to year based on station closures; there were 979 [2] as
of March 2014. The Navy currently has 4,144 enlisted and officer production recruiters
charged with finding qualified civilians to enlist or commission in the Navy [3]. District
leaders are responsible for setting recruitment goals and managing recruitment resources
available to their district. One difficulty recruiting leaders face is identifying the best areas
to recruit from and invest resources into. Training NRD leaders is difficult because of the
limited amount of time they have prior to making resource allocation decisions. This training
is important so that leaders allocate resources to the right areas and meet the mission
set forth by their commanders.

In order to address this training shortfall, NRC N5, the Strategic Plans, Research, and 
Analysis branch of Navy Recruiting Command, developed a training game which allows 
NRD leaders to practice allocating resources while attempting to maximize the number of 
recruits obtained. The goal is to train NRD leaders to identify areas that are more likely to 
produce better results. N7, the training branch of NRC, and N5 identified the need for a
statistical model for use both in forecasting accessions (N5) and within the training
game (N7).


1.2 Problem Statement 

The purpose of this research is to develop a statistical model that predicts a Navy Recruiting 
Station’s expected number of enlisted accessions and identify the most relevant predictors 
of a station’s recruiting potential. The word “produce” in this study refers to signed
contracts for accessions (recruits) who are going to basic training. The model developed will
be utilized within the training game for recruiters and to inform market analysis conducted
by NRC N5. This will benefit NRD leaders because they will be able to experiment with
different resource allocation strategies prior to actually implementing them. This will
benefit NRC because their leaders will have a better understanding of how the decisions they
make are likely to affect recruiting outcomes. The empirically based model will also
ensure that leaders are being trained to attend to the most important indicators associated with
successful recruiting.


1.3 Overview of Research

This thesis is organized in the following manner: Chapter 2 covers previous research on 
Navy enlisted recruitment, a description of the training game the model will be utilized 
in, and the model that NRC N5 currently uses to predict an area’s propensity to produce
accessions. Chapter 3 is broken down into two sections: a description of the data sets 
and the methodology used in the analysis. Chapter 4 focuses on the analysis of the data 
and models created during this research. Lastly, Chapter 5 provides recommendations for 
which model to utilize and future work to improve upon what has been completed. 





CHAPTER 2: 
Background 


2.1 Literature Review 

This chapter focuses on previous work that is relevant to the proposed research. The Navy 
has been conducting analysis on recruiting models since 1973 when the all-volunteer force 
began. At that time, America’s previous major engagements were supplied with men who 
were drafted into the military. Studies on Navy recruitment range from ways to predict the 
number of recruits needed to fill the ranks of the Navy, to studies on enlistment bonuses 
and recruiting station placement. The following sections describe the most relevant and
recent analyses that have been completed on Navy recruitment. The chapter concludes with
a short summary and explanation of the “Make Goal” interactive game.

2.1.1 An Analysis of Navy Recruiting Goal Allocation Model 

Pinelis et al., under the Center for Naval Analyses, conducted a study of Navy Recruiting
Command’s recruit “goaling” model in 2011 [4]. The purpose of their work was to
determine the best way to allocate recruitment goals to geographic locations. Their
analysis focused on maximizing available market information and efficient use of recruiting
resources. They also wanted to have the correct demographics represented within their 
model due to an increased focus on diversity within the Navy. The study analyzed the five 
different types of personnel that are recruited into the Navy. These categories were enlisted 
active, enlisted reserve, officer active, officer reserve, and medical. The study concluded 
that the enlisted goaling model that NRC was utilizing only provided goals to the district 
level. This did not allow an efficient use of resources because it left the districts with the 
task of assigning goals to the recruiting stations. 

Pinelis et al. developed two models that utilized ZIP code level data. The first model was 
called the count model. This model utilized specific information regarding the distances
to the closest colleges or universities, the size of each college or university, the
intersection of size and distance of college or university, areas with multiple colleges or
universities within them, and historically black colleges or universities. The second model
utilized distance to Navy Recruiting Stations, demographic data, Navy Awareness Index,
number of recruiters for each service, and crime data. All of these factors predicted the
number of net new contracts the Navy would attain in a certain fiscal year (FY).

This study recommended the adoption of a ZIP code level model to provide greater
accuracy in recruit predictions down to the station level and improve the utilization of recruiting
resources. The authors further suggested using the enlisted goaling model to allocate NRD 
goals and using the ZIP code level model to set goals for the stations within the NRD [4]. 


2.1.2 Navy Enlistment Supply Model at the Recruiting Station Level 

In 2008, McRoberts completed a thesis that built a model to predict the number of
high-quality male Navy enlistments at the Navy Recruiting Station (NRS) level [5].
Additionally, his study examined whether there was a relationship between the number of
high-quality recruits and the proximity of military installations (including NRSs), as well as
various measures of public high school data within the NRS’s area of responsibility (AOR) [5]. He
utilized three techniques: ordinary least squares multiple linear regression modeling,
regression trees, and neural networks. For his response variable, he utilized the number of
males with Armed Forces Qualification Test (AFQT) scores of fifty or higher who entered the
Navy’s Delayed Entry Program (DEP). His predictor variables consisted of several distance
measurements with regard to NRS location, the number of high schools within an NRS’s AOR,
and the proximity to the closest military facilities. Additionally, other variables included
demographic data of males within the recruiting stations’ AOR, economic variables, and
the average number of recruiters within an area. For a more detailed view of each variable,
please consult Table 3 of McRoberts’ thesis. He concluded that the neural network model
was the best model to utilize when trying to predict high-quality male Navy enlistments.
Additionally, he found an increase in quality male accessions is related to a closer
proximity of military installations to the NRS, higher student-to-teacher ratios, lower
"Promoting Power" scores, and lower percentages of students on subsidized lunches. "Promoting
Power" is a measure of high school graduation rates.


2.1.3 Predicting the Number of Potential Military Recruits over the
Next 10 Years with Application to Recruiter Placement

In 2007, Britton completed a thesis that focused on the placement of recruiters within
an area based on a ten-year forecast of the number of recruits produced from a specific
ZIP code [6]. The data included demographic data of military applicants from 1998 to
2006, recruiter assignment histories, recruiting station ZIP codes, and predicted populations
within each ZIP code. He developed what he called the “Propensenator”, an estimate of
the number of recruits from a specific ZIP code who will apply for the military [6]. He
then applied the Propensenator to each ZIP code. The results of his study were that the
Navy did well in allocating the correct number of recruiters to the right stations. Britton
recommended, prior to recruiter assignment, utilizing the two tables created from his study.
Those tables would allow leaders to check historical manning of each NRS, determine
recruiter effectiveness, and identify the propensity index for each NRS.


2.1.4 Enlistment Supply at the Local Market Level 

Hogan et al. conducted a study for the Directorate of Accession Policy, which focused on 
predicting the number of high-quality enlistments for both the Army and Navy and the 
effects of the location of the recruiting stations within a certain ZIP code [7]. High-quality
enlistments are individuals who scored in the top 50% on the AFQT. The data utilized
in this study came from the Navy Steam Database, which includes information about the
number of Navy production recruiters assigned to a specific station each quarter, ZIP codes
in the station’s territory, and the ZIP code of where the station is located [7]. They also
utilized two different types of income data from the 1990 census, county unemployment
rates, and the total number of 17- to 21-year-old males within each ZIP code [7]. This study
built two models, one at the ZIP code level and one at the recruiting station level. The
analysis indicated that Navy recruiters were more productive in areas where Navy and
Army recruiting stations and high schools were present. Additionally, higher-income areas
and greater distances between an NRS and a given ZIP code were both associated with
lower enlistment rates [7].
2.1.5 Summary of Literature Review

In several of the studies a ZIP code level model is recommended for use in order to provide
more accurate predictions on the number of Navy accessions. All of these studies utilized
some form of regression analysis to generate predictions and utilized data sets that were
fairly similar. This thesis provides further research on Navy accessions utilizing regression
models and different predictor variables than previously utilized. These variables are from
data sets provided by NRC and are currently used by NRC leadership for analysis of the
recruiting market. Additionally, this thesis provides a station-level regression model to utilize
in forecasting accessions and the training game to assist NRD leadership with recruitment
resource allocation strategies. The training game is described in the next section.


2.2 Make Goal Interactive Training Game 

Within the U.S. Military, it is common to serve between two and four years in a certain job.
One can imagine that it takes time until a service member is able to fully understand and
effectively do the job required of them. The purpose of the Make Goal game is to better
train incoming Navy recruiting leaders. The game, played at the district level, attempts to
train leaders on how to better utilize resources and personnel in order to meet recruiting
goals while minimizing the costs associated with doing so. Initially, the player analyzes
the data available within the game. This includes variables such as the competition index,
the number of recruiters assigned to each station, and additional market indicators. Next,
the player has to decide where to place the recruiters based on his or her examination of the
data available within the district. Figure 2.1 is a display of what the game looks like, using
NRD Minneapolis as an example. The game interface provides the player a visual display
of all recruiting stations, the number of recruiters, competition index of each area, major
highways, Military Entrance Processing Stations (MEPS), and market size. Currently, the
game does not utilize an empirical model to determine the number of recruits obtained by
the player. Implementing an empirically based model would help train NRD leadership to
allocate resources to meet their mission. The Noble Index is one model that is being considered
for implementation into the game.

Figure 2.1: Make Goal Game Interface

2.2.1 The Noble Index

The Noble Index was created by Mike Evans and Robby Powell from NRC (N5) [8]. Its
purpose is to determine market potential of a specific geographic area with regard to the
location of potential recruits. According to the report, the analysis aims to identify the best
markets in which to position recruiters so as to reduce recruiting resources with regard to
the amount of personnel and money utilized in attaining recruits [8].

Noble Index Variables, Calculations, and Results 

To construct the Noble Index, Evans and Powell selected several variables that characterize
a geographic area’s recruitable market [8]. The resulting index is a value for each NRS
that describes the strength of a geographic recruiting market. The index values range from
18 to 188, with higher values indicating a better market [8]. According to Evans and Powell,
an index value of less than 90 means that the area should be examined for realignment,
an index of 90-110 means the area is minimally acceptable and should not be considered
for realignment, and an index of greater than 110 indicates a properly aligned area. The
index is useful to NRD leadership in deciding whether the recruiting stations are properly
aligned. A properly aligned station means the correct ZIP codes are within the station’s area
of responsibility [9]. The factors within the Noble Index are not empirically derived; they
are based on the analysis from NRC N5. Additionally, the weights used to combine factors
within the Noble Index were provided by subject matter experts, not through analysis of
the data available. These variables are described in Table 2.1. For a description of how to
calculate the Noble Index value, please reference [8].
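
The interpretation thresholds described above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name and the category labels are hypothetical, and the sample index values are made up. Only the cut points (below 90, 90-110, above 110) come from the text.

```python
def classify_noble_index(value: float) -> str:
    """Map a Noble Index value to the alignment guidance described in [8]."""
    if value < 90:
        return "examine for realignment"
    if value <= 110:
        return "minimally acceptable"
    return "properly aligned"


# Hypothetical station index values, for illustration only.
for niv in (72, 90, 110, 134):
    print(niv, "->", classify_noble_index(niv))
```

Both endpoints of the 90-110 band are treated here as minimally acceptable, following the "less than 90" and "greater than 110" wording.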

2.2.2 A Way Forward 

For NRD leaders to be successful in their job, it is essential for them to conduct
quantitative analysis of their districts. Incorporating an empirically based model within Make
Goal provides NRD leadership an accurate portrayal of how resource allocation
decisions affect recruitment goals and prevents negative transfer of training. This thesis focuses
on building regression models that are capable of predicting the number of accessions
obtained by each station. Additionally, this work provides a statistical model to use in place
of the Noble Index. The next chapter describes the methodology and the data sets utilized
within this thesis.
Table 2.1: Variable Description for Noble Index Data Set

Variable: Population Density
  Definition: Measure of how many people live in a square mile
  Impact to Recruiting: Low pop density means harder to recruit. Requires more resources and travel time

Variable: QMA to Population Ratio
  Definition: Percentage of population that is Qualified Military Available (QMA)
  Impact to Recruiting: Higher QMA ratio means easier to recruit from area and less resources utilized.

Variable: Percent HSDG
  Definition: Percentage of populations with min education of high school diploma
  Impact to Recruiting: Low HSDG rates require more resources to find qualified recruits

Variable: Leads Per Capita
  Definition: 4 year average of national leads divided by population of area
  Impact to Recruiting: Higher leads indicate interest in military service. Roughly equates with propensity to serve.

Variable: Percent Non-White
  Definition: Percent of population that is non-white
  Impact to Recruiting: Higher means more diverse area, easier for recruiters to find qualified diverse populations

Variable: Qualified Military Available (QMA) Quality Ratio
  Definition: Percentage of QMA estimated within the Armed Forces Qualification Test (AFQT) Test Score Category (TSC) I-IIIA
  Impact to Recruiting: Higher ratio means easier for recruiter to find highly-desired recruits.

Variable: Percent Top 22 DoD Market Segments
  Definition: Market Segments from Navy Market Segmentation Model (NMSM) with highest conversion rate (i.e., highest rate of accessions)
  Impact to Recruiting: Higher percentage is likely to produce accession at higher rate. 40-50% of annual accessions come from Top 22 segments.

Variable: Percent Bottom 22 DoD Market Segments
  Definition: Market Segments from NMSM with lowest conversion rate
  Impact to Recruiting: Higher percentage means lower accessions rate which requires more resources from recruiters

Variable: Quality Female QMA to Total QMA Ratio
  Definition: Ratio of quality female QMA count to the total QMA. Estimates percent of QMA that is quality female
  Impact to Recruiting: Increased emphasis from Navy to recruit more Female sailors. Higher ratio means easier for recruiters to find qualified female recruits

Variable: Percent Quality Accessions that are Female
  Definition: Percentage of quality accessions that are female
  Impact to Recruiting: Higher percentage indicates easier area for recruiters to find quality female recruits

Variable: Percent Top 22 Female Market Segments
  Definition: Percentage of top 22 market segments from NMSM
  Impact to Recruiting: Higher percentage means easier for recruiters to find female recruits

Variable: Enlistment Screening Test (EST) to Total Navy Accessions Ratio
  Definition: Ratio of EST’s given to accessions within geographic area
  Impact to Recruiting: Pseudo measure of level of effort. Higher rate means lower level of effort to filter through making recruiting easier

Variable: 75th Percentile Recruit Distance to NRS
  Definition: Max distance 75% of recruits have to travel to get to Navy Recruiting Station
  Impact to Recruiting: Higher rate means more resources required through driving time for both recruits and recruiters

Variable: NRS Distance to Military Entrance Processing Station (MEPS)
  Definition: Distance between Recruiting Station and the nearest MEPS
  Impact to Recruiting: Higher distance means more resources required to get recruits to MEPS. Increased driving time leads to less time actually recruiting

Variable: EMR Competition Index
  Definition: Indication of ’tightness’ of recruiting market
  Impact to Recruiting: Higher index means area is more contested between the services. Navy recruiter will require more resources to reach the Navy’s ’fair share’ of recruits

Variable: DoD Quality NMSM Index
  Definition: Index from NMSM that compares estimated DoD quality accessions rate for geographic area compared to national average
  Impact to Recruiting: High index tends to have higher proportion of people within best accessing segments. Is easier for recruiters to find someone who will access

Variable: DoD Quality Female Index
  Definition: Index from NMSM that compares estimated quality female accessions rate for geographic area compared to national average
  Impact to Recruiting: High index means higher proportion of people within best accessing segments. Easier for recruiters to find a female who will access

Variable: 3 Year Average Quality All Service Accession Data (ASAD)
  Definition: Average quality accessions for past 3 years
  Impact to Recruiting: High average means easier for recruiters to find quality recruits. High average may indicate a need to reduce recruiting resources in area.


CHAPTER 3:
Data and Methodology


This chapter is broken into two different sections. The first is a description of the data 
sets utilized within the research. The second is a description of the methodology of the 
research. The purpose of this chapter is to provide an overview of the data and methods 
used during the analysis. 

3.1 Data Description 

All data sets were provided by NRC, specifically the Research, Analysis, and Modeling 
division (N52). NRC provided data from the Enlisted Market Report (EMR), monthly 
accessions data, and data used to construct the Noble Index. These data sets contain
information which is valuable to understanding the recruiting environment. These three sources
of raw data were used to construct two separate additional data sets used during this
analysis. The first of these is the three-year model data set and the second is the monthly model
data set. The remainder of this section explains each of the data sources in detail. 

3.1.1 Enlisted Market Report 

The first data set is the Enlisted Market Report (EMR). NRC personnel update this data set 
every six months in March and September. The EMR provides data used to analyze the 
recruiting market for each station [9]. The EMR data supports NRC leadership decisions on
(1) whether an area has the correct number of recruiters at each station, (2) whether different
ZIP codes should be assigned to a different NRS, and (3) whether to close an NRS. The EMR
consists of several variables which are described in detail in Appendix A. The data utilized 
within this research is from the March 2014 EMR. 

The EMR consists of 79 variables. The first several variables deal with coding the
specific station to its respective NRD, the Recruit Station Identification Number and Station
Name. The next several variables describe how competitive the recruiting market is,
giving yearly totals of prior accessions within the specific station’s AOR, and the average
number of recruiters within the station’s AOR by military branch. The next set of
variables breaks down the total number of males within the station boundaries by age, race,
AFQT scores, and education. Additional variables determine how many prior service
individuals there are in a specific area as well as how many people signed up for selective
service by race. The next set of variables includes how many local and national leads there
are within the station’s AOR over the past three fiscal years. The last variables describe
market-specific data, such as estimates of how many individuals there are by certain ages
and market segments. These market segments come from the Navy’s Market Segmentation
Model (NMSM), which breaks down the entire population into 66 distinct segments. About
50% of the Navy’s yearly accessions come from the top 22 market segments [10].


3.1.2 Noble Index Data Set 

The second data set supports the construction of the Noble Index. The purpose of the Noble 
Index is to best describe a geographic area’s recruitable market. It assists in determining 
if an NRS is properly aligned [8]. Recruiting in areas with higher index values may result 
in a reduction of recruiting costs [8]. It is built at the station level, but can be modified 
to the ZIP code, district, or regional level. The authors suggest updating the Noble Index 
bi-annually in conjunction with the EMR [8]. 

There are 18 variables within the Noble Index data set. These variables include location 
data, such as distances from each NRS to the closest Military Entrance Processing Station 
(MEPS) and the maximum distance 75% of the recruits travel to get to the NRS. The next 
set of variables includes demographic data such as how many people live in a specific area,
the percentage of high school graduates within an area, and the percentage of
diversity present within an area. Additional variables deal with specific market information.
These include: 

• the percentage of individuals within the top 22 and bottom 22 market segments of 
the NMSM 

• the percentage of the population considered to be qualified for military service 

• the four-year average number of national leads divided by the population of the area

• the percentage of Qualified Military Available estimated to be quality accessions 

• an index value from the NMSM of quality accession rates compared to the national 
average 

• the competition index of each NRS 
The next variables deal with accession data. These variables include:


• three-year average quality accessions for all services 

• the ratio of Enlistment Screening Tests (EST) given to accessions

• the percent of quality accessions that are female 

The last variables deal with female-specific data. These variables are: 

• the percentage of females within the top 22 market segments of the NMSM 

• the ratio of quality females considered QMA to the total QMA count of an area 

• an index value from the NMSM of the estimated quality female accession rate for an 
area compared to the national average 

For a detailed explanation of all variables within the Noble Index data set and how to
calculate the Noble Index value for each NRS, please refer to Table 2.1 at the end of Chapter 
2 or [8]. 

3.1.3 Monthly Accession Data Set 

The third data set is monthly accession data, which covers fiscal year 2011 through fiscal 
year 2013. It consists of all Navy accessions by ZIP code, regardless of test scores, race, 
age, or gender, for the 36 months from October 2010 through September 2013. This information 
is aggregated to the station level based on the current alignment of the stations. 
These data serve as the basis for our response variable within the different models built 
throughout this thesis. Figure 3.1 is a snapshot of what the data set looks like once it is 
aggregated to the station level. RSID is the Recruit Station Identification Number and the 
X columns are the year and month (XYYYYMM). For example, RSID number 102001 had 
4 accessions in October 2010. 

3.1.4 Three-Year Model Data Set 

Using elements from the previously described data sets, we constructed a data set to support 
the development of a three-year model. As mentioned before, this consolidated data set 
takes variables from the EMR, Noble Index, and monthly accession data sets. It consists of 
all variables within the Noble Index data set, the average number of Navy recruiters over the 
past 12 months for each NRS, the Noble Index value for each NRS, and the total number 
of accessions over three years for each NRS. To generate the total number of three-year 
accessions, the monthly accession data is first aggregated from the ZIP code level to the 
station level. Next, the monthly accession data is summed across all months to give a total 
number of accessions over three years for each NRS. This results in one column named 
TotACCSum. For a detailed explanation of the remaining variables please see Table 3.1. 

Figure 3.1: Sample of Monthly Accession Data by Station 
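
The two aggregation steps described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the thesis: the ZIP codes, the ZIP-to-station alignment table, and the function names are hypothetical, and only the XYYYYMM column naming, the RSID identifier, and the TotACCSum total come from the text.

```python
from collections import defaultdict

# Hypothetical ZIP-to-station alignment and ZIP-level monthly accession counts.
zip_to_rsid = {"93940": "102001", "93950": "102001", "95060": "102002"}
zip_monthly = {
    "93940": {"X201010": 3, "X201011": 2},
    "93950": {"X201010": 1, "X201011": 0},
    "95060": {"X201010": 2, "X201011": 5},
}


def aggregate_to_station(zip_counts, alignment):
    """Sum ZIP-level monthly counts into station-level monthly counts."""
    station = defaultdict(lambda: defaultdict(int))
    for zip_code, months in zip_counts.items():
        rsid = alignment[zip_code]
        for month, count in months.items():
            station[rsid][month] += count
    return {rsid: dict(months) for rsid, months in station.items()}


def total_accessions(station_counts):
    """Sum each station's monthly counts across all months (TotACCSum)."""
    return {rsid: sum(months.values()) for rsid, months in station_counts.items()}


station_monthly = aggregate_to_station(zip_monthly, zip_to_rsid)
tot_acc_sum = total_accessions(station_monthly)
print(station_monthly["102001"]["X201010"], tot_acc_sum)
```

In this toy input, the two ZIP codes aligned to station 102001 contribute 3 and 1 accessions in October 2010, so the station-level count for that month is 4.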

3.1.5 Monthly Model Data Set 

Using elements from the previously mentioned data sets, we constructed a data set to support 
the development of a monthly model. This includes all variables within the Noble 
Index data set, the average number of Navy recruiters over the past 12 months, the Noble 
Index value for each NRS, and monthly accession data. To do this, the monthly accessions 
are aggregated from the ZIP code level to the station level. This gives the number 
of monthly accessions for each recruiting station. Next, two variables were constructed 
from this data. The first variable is the date variable called "mth," which is a categorical 
variable that allows the model to account for the month the observation took place. The 
second variable is the number of accessions for each specific NRS per month, called "NumRecruits." 
The NumRecruits variable creates 36 observations per NRS, resulting in 35,244 
observations (979 stations x 36 observations per NRS). Please refer to Table 3.2 for a full 
description of each variable. 
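
The reshaping from one row per station to one row per station and month can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and placeholder station identifiers are hypothetical, and treating a missing month as zero accessions is an assumption rather than something the text specifies. The Month, mth, and NumRecruits names follow Table 3.2.

```python
def month_columns(start_year=2010, start_month=10, n_months=36):
    """Build the XYYYYMM column names for October 2010 through September 2013."""
    columns = []
    year, month = start_year, start_month
    for _ in range(n_months):
        columns.append(f"X{year}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return columns


def wide_to_long(station_counts, columns):
    """Turn one record per station into one record per station and month."""
    rows = []
    for rsid, months in station_counts.items():
        for col in columns:
            rows.append({
                "RSID": rsid,
                "Month": col[1:],                    # year and month, e.g. "201010"
                "mth": col[5:],                      # month only, used as a category
                "NumRecruits": months.get(col, 0),   # assumption: missing means zero
            })
    return rows


columns = month_columns()
# 979 placeholder stations with no recorded counts, used only to check the row count.
stations = {f"S{i:03d}": {} for i in range(979)}
long_rows = wide_to_long(stations, columns)
print(len(columns), len(long_rows))  # 36 35244
```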

3.2 Methodology 

There are several ways of building predictive models. This thesis uses linear regression 
and Poisson regression. This section describes why these two methods were chosen, the 
model assumptions associated with both, and the verification techniques used 
to validate the models. 

3.2.1 Linear Regression 

Linear regression provided a starting point for our use case since it can predict 
future observations and quantify relationships between the response and predictor 
variables. Additionally, these models are easy to interpret because of their underlying linear 
equation. Least squares estimation is the technique utilized for linear regression. A linear 
regression model takes the general form of: 

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$ 

This means that the response variable ($Y$) can be predicted from the parameter estimates 
($\beta_i$) times their predictor variables ($x_i$) plus error ($\varepsilon$). The technical assumptions of 
a linear regression model are [11]: 

• Errors are normally distributed 

• Errors have constant variance 

• Errors are independent and uncorrelated 

• The model is structurally sound in that the response variable can be explained 
by a linear approximation of the predictor variables. 
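
As a small illustration of least squares estimation, the sketch below fits a one-predictor model in closed form and computes the residuals that the assumption checks rely on. The data values and function names are made up for illustration and are not taken from the thesis data.

```python
def fit_simple_ols(x, y):
    """Closed-form least squares estimates for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed minus fitted values; these feed the residual and QQ plots."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: number of recruiters (x) and accessions (y) at five stations.
x = [1, 2, 3, 4, 5]
y = [22, 41, 59, 82, 101]
b0, b1 = fit_simple_ols(x, y)
resid = residuals(x, y, b0, b1)
print(round(b0, 3), round(b1, 3))
```

With an intercept in the model, the least squares residuals sum to zero, so the diagnostic plots look at their spread and shape rather than their average.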

For a linear regression model, the following techniques are utilized to verify the 
assumptions [11]: 

• Summary Statistics 

• Residual Values against Fitted Values plot 

• Residual Plot 

• Normal QQ line plot 

• Cook’s Distance plot 

• Correlation Matrix 


A multiple linear regression model provides predictions by generating parameter 
estimates for each variable within the data set. For those predictions to be accurate, the 
fitted model should satisfy the assumptions of a linear regression model. Additionally, a 
linear regression model can generate negative predictions, which in the case of recruiting 
would have to be interpreted as zero accessions. To explore an alternative, a Poisson 
regression model is also utilized. A Poisson regression model is a type of count regression [12]. 
The following section addresses the methodology for selecting a Poisson regression model. 


3.2.2 Poisson Regression 

Poisson regression models are applicable when a small portion of a larger sample is affected 
by a certain event. For example, Poisson regression models have been used successfully to 
model incidences of rare forms of cancer [12]. This reasoning can be applied to military 
recruiting: only a small portion of the qualified public is able to join, and only a smaller 
portion of that population actually does join. Since our response variable is count data, a 
Poisson regression model may be appropriate. The response variable is the total number 
or total count of accessions for each station. The Poisson distribution takes only non-negative 
integer values, so the model cannot predict negative accessions. Previous 
research suggests Poisson regression models work well with recruit prediction models [4]. 
The technical assumptions of a Poisson model are [12]: 

• The response variables $Y_i$ are independent and follow a Poisson distribution with mean $\mu_i$ 

• $\log(\mu_i) = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}$ 

The first assumption proposes that the response variable has a Poisson distribution. This 
also indicates that for several Y’s, $Y_i \sim \mathrm{Pois}(\mu_i)$ [12]. The second assumption means 
that the log of the mean can be expressed as a linear combination of the predictor 
values. In order to verify the assumptions of the Poisson model, the following verification 
techniques are utilized [12]: 

• Summary Statistics 

• Estimated Variance against Mean plot 

• Partial residual plots by fitting a generalized additive model 

• Cook’s Distance plot to check for influential outliers 

• Dispersion Parameter 

The Poisson regression model specifically addresses count data and a response variable that 
follows a Poisson distribution, which applies directly to military recruiting data. This, along 
with the fact that the Poisson model only generates predictions between zero and positive 
infinity, is why it is of use to this thesis. The next chapter focuses on the models within this 
study and their results. 
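
To make the log link and the dispersion check concrete, the sketch below computes a Poisson mean from a linear predictor and a Pearson-based dispersion estimate. The coefficients and data are hypothetical, and the Pearson statistic divided by the residual degrees of freedom is one common way to estimate dispersion; the thesis does not state which estimator it used.

```python
import math


def poisson_mean(beta0, betas, xs):
    """Mean under the log link: mu = exp(b0 + b1*x1 + ... + bk*xk)."""
    return math.exp(beta0 + sum(b * x for b, x in zip(betas, xs)))


def dispersion(y, mu, n_params):
    """Pearson chi-square statistic divided by residual degrees of freedom."""
    pearson = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return pearson / (len(y) - n_params)


# Hypothetical coefficients: even a strongly negative linear predictor
# gives a positive predicted mean, unlike the linear model.
small_mean = poisson_mean(-5.0, [-2.0], [3.0])

# Hypothetical observed counts and fitted means from a two-parameter model.
y_obs = [2, 4, 3, 5]
mu_hat = [3.0, 3.0, 4.0, 4.0]
phi = dispersion(y_obs, mu_hat, n_params=2)
print(small_mean > 0, round(phi, 4))
```

A dispersion estimate near one is consistent with the Poisson assumption that the variance equals the mean; values well above one suggest overdispersion.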

3.2.3 Variable Selection 

The variable selection technique we utilized is an analysis of the model’s Akaike’s Information 
Criterion (AIC) value. This value is a statistic that measures how well the model fits 
the data set while penalizing the model for having too many variables. To calculate the AIC 
value, the following formula is utilized [11]: 

$$\mathrm{AIC} = n \log(\mathrm{RSS}/n) + 2p$$ 

where $n$ is the number of observations, RSS is the residual sum of squares, and $p$ is the 
number of parameters [11]. The model with the lowest AIC is typically the best model to 
utilize since it balances having the simplest model with the best fit to the data [11]. 

To determine the model with the minimum AIC value, the dropterm() function in R was 
utilized. This function evaluates how removing each predictor variable one at a time affects 
the overall AIC value of the model. It then ranks each predictor variable based on 
importance to the model. The model with the lowest AIC value is selected. 
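
The drop-one-term idea can be illustrated without R. The sketch below is not the dropterm() implementation; it is a small stand-in that refits a least squares model with each predictor removed in turn and reports the resulting AIC. The toy data, predictor names, and helper functions are hypothetical.

```python
import math


def fit_rss(columns, y):
    """Least squares fit with an intercept; returns the residual sum of squares."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def aic(rss, n_obs, n_params):
    """AIC = n log(RSS / n) + 2p, as in the formula above."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


def drop_one(predictors, y):
    """AIC of the model that results from dropping each predictor in turn."""
    results = {}
    for name in predictors:
        kept = [col for other, col in predictors.items() if other != name]
        results[name] = aic(fit_rss(kept, y), len(y), len(kept) + 1)
    return results


# Hypothetical data: 'recruiters' drives the response, 'noise' does not.
predictors = {
    "recruiters": [1, 2, 3, 4, 5, 6, 7, 8],
    "noise": [3, 1, 4, 1, 5, 9, 2, 6],
}
y = [21, 39, 62, 78, 101, 119, 142, 158]
drop_aic = drop_one(predictors, y)
print(sorted(drop_aic, key=drop_aic.get))
```

The predictor whose removal yields the lowest AIC is the first candidate to drop; a large increase in AIC when a predictor is removed marks it as important.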
Table 3.1: Three-Year Model Data Set Description 

| Variables | Definitions | Impact |
|---|---|---|
| RSID | Specific recruiting station identification number | Utilized as an identity variable |
| Population Density | Measure of how many people live in a square mile | Low pop density means harder to recruit. Requires more resources and travel time |
| Noble Index Value | Value assigned to each station based on calculations described in the Noble Index technical report | Value above 110 indicates station properly aligned, value 90-110 indicates station alignment is satisfactory with no need for realignment, and value below 90 indicates a station needs to be realigned. |
| QMA to Population Ratio | Percentage of population that is Qualified Military Available (QMA) | Higher QMA ratio means easier to recruit from area and less resources utilized. |
| Percent HSDG | Percentage of populations with min education of high school diploma | Low HSDG rates require more resources to find qualified recruits |
| Leads Per Capita | 4 year average of national leads divided by population of area | Higher leads indicate interest in military service. Roughly equates with propensity to serve. |
| Percent Non-White | Percent of population that is non-white | Higher means more diverse area, easier for recruiters to find qualified diverse populations |
| QMA Quality Ratio | Percentage of QMA estimated within the AFQT TSC I-IIIA | Higher ratio means easier for recruiter to find highly-desired recruits. |
| Percent Top 22 DoD Market Segments | Market Segments from NMSM with highest conversion rate (i.e., highest rate of accessions) | Higher percentage is likely to produce accession at higher rate. 40-50% of annual accessions come from Top 22 segments. |
| Percent Bottom 22 DoD Market Segments | Market Segments from NMSM with lowest conversion rate | Higher percentage means lower accessions rate which requires more resources from recruiters |
| Quality Female QMA to Total QMA Ratio | Ratio of quality female QMA count to the total QMA. Estimates percent of QMA that is quality female | Increased emphasis from Navy to recruit more female sailors. Higher ratio means easier for recruiters to find qualified female recruits |
| Percent Quality Accessions that are Female | Percentage of quality accessions that are female | Higher percentage indicates easier area for recruiters to find quality female recruits |
| Percent Top 22 Female Market Segments | Percentage of top 22 market segments from NMSM | Higher percentage means easier for recruiters to find female recruits |
| EST to Total Navy Accessions Ratio | Ratio of EST’s given to accessions within geographic area | Pseudo measure of level of effort. Higher rate means lower level of effort to filter through making recruiting easier |
| 75th Percentile Recruit Distance to NRS | Max distance 75% of recruits have to travel to get to Navy Recruiting Station | Higher rate means more resources required through driving time for both recruits and recruiters |
| NRS Distance to MEPS | Distance between Recruiting Station and the nearest MEPS | Higher distance means more resources required to get recruits to MEPS. Increased driving time leads to less time actually recruiting |
| EMR Competition Index | Indication of ’tightness’ of recruiting market | Higher index means area is more contested between the services. Navy recruiter will require more resources to reach the Navy’s ’fair share’ of recruits |
| DoD Quality NMSM Index | Index from NMSM that compares estimated DoD quality accessions rate for geographic area compared to national average | High index tends to have higher proportion of people within best accessing segments. Is easier for recruiters to find someone who will access |
| DoD Quality Female Index | Index from NMSM that compares estimated quality female accessions rate for geographic area compared to national average | High index means higher proportion of people within best accessing segments. Easier for recruiters to find a female who will access |
| 3 Year Average Quality ASAD | Average quality accessions for past 3 years | High average means easier for recruiters to find quality recruits. High average may indicate a need to reduce recruiting resources in area. |
| Total Navy Accessions over the past 3 Years (TotACCSum) | This is the total number of navy accession for the past three fiscal years | This is aggregated from the monthly accession by zip-code data set and the response variable within both linear regression models. |
| Number of Navy Recruiters at an NRS (Navy) | Average number of Navy recruiters over the past 12 months at the specific NRS | This allows the model to account for how many people were dedicated to finding recruits for that specific station |

Table 3.2: Monthly Model Data Set Description 

| Variables | Definitions | Impact |
|---|---|---|
| RSID | Specific recruiting station identification number | Utilized as an identity variable |
| Population Density | Measure of how many people live in a square mile | Low pop density means harder to recruit. Requires more resources and travel time |
| QMA to Population Ratio | Percentage of population that is Qualified Military Available (QMA) | Higher QMA ratio means easier to recruit from area and less resources utilized. |
| Percent HSDG | Percentage of populations with min education of high school diploma | Low HSDG rates require more resources to find qualified recruits |
| Leads Per Capita | 4 year average of national leads divided by population of area | Higher leads indicate interest in military service. Roughly equates with propensity to serve. |
| Percent Non-White | Percent of population that is non-white | Higher means more diverse area, easier for recruiters to find qualified diverse populations |
| QMA Quality Ratio | Percentage of QMA estimated within the Armed Forces Qualification Test (AFQT) Test Score Category (TSC) I-IIIA | Higher ratio means easier for recruiter to find highly-desired recruits. |
| Percent Top 22 DoD Market Segments | Market Segments from Navy Market Segmentation Model (NMSM) with highest conversion rate (i.e., highest rate of accessions) | Higher percentage is likely to produce accession at higher rate. 40-50% of annual accessions come from Top 22 segments. |
| Percent Bottom 22 DoD Market Segments | Market Segments from NMSM with lowest conversion rate | Higher percentage means lower accessions rate which requires more resources from recruiters |
| Quality Female QMA to Total QMA Ratio | Ratio of quality female QMA count to the total QMA. Estimates percent of QMA that is quality female | Increased emphasis from Navy to recruit more female sailors. Higher ratio means easier for recruiters to find qualified female recruits |
| Percent Quality Accessions that are Female | Percentage of quality accessions that are female | Higher percentage indicates easier area for recruiters to find quality female recruits |
| Percent Top 22 Female Market Segments | Percentage of top 22 market segments from NMSM | Higher percentage means easier for recruiters to find female recruits |
| EST to Total Navy Accessions Ratio | Ratio of EST’s given to accessions within geographic area | Pseudo measure of level of effort. Higher rate means lower level of effort to filter through making recruiting easier |
| 75th Percentile Recruit Distance to NRS | Max distance 75% of recruits have to travel to get to Navy Recruiting Station | Higher rate means more resources required through driving time for both recruits and recruiters |
| NRS Distance to MEPS | Distance between Recruiting Station and the nearest MEPS | Higher distance means more resources required to get recruits to MEPS. Increased driving time leads to less time actually recruiting |
| EMR Competition Index | Indication of ’tightness’ of recruiting market | Higher index means area is more contested between the services. Navy recruiter will require more resources to reach the Navy’s ’fair share’ of recruits |
| DoD Quality NMSM Index | Index from NMSM that compares estimated DoD quality accessions rate for geographic area compared to national average | High index tends to have higher proportion of people within best accessing segments. Is easier for recruiters to find someone who will access |
| DoD Quality Female Index | Index from NMSM that compares estimated quality female accessions rate for geographic area compared to national average | High index means higher proportion of people within best accessing segments. Easier for recruiters to find a female who will access |
| 3 Year Average Quality ASAD | Average quality accessions for past 3 years | High average means easier for recruiters to find quality recruits. High average may indicate a need to reduce recruiting resources in area. |
| Total Navy Accessions over the past 3 Years | This is the total number of navy accession for the past three fiscal years | This is aggregated from the monthly accession by zip-code data set and the response variable within both linear regression models. |
| Month | The specific year and month that the accession data came from | This covers a time period and is utilized to separate accession data into monthly observations. |
| Mth | This is the month variable as a factor or categorical variable without the year included | This allows for the model to account for the month that the observation took place. |
| NumRecruits | This is the total number of accession by month and year for a specific NRS | This is the response variable within this specific data set. |

CHAPTER 4: 
Analysis 


This chapter is broken into two sections. The first section describes the construction and 
comparison of three-year accession models at the station level. Two three-year models are 
analyzed: a univariate linear regression model based on the Noble Index and a multiple 
linear regression model using individual variables from the Noble Index. The response 
variable for each three-year model is the total three-year accessions by station. In the 
second section, three monthly models are analyzed: a univariate linear regression model based on 
the Noble Index, a multiple linear regression model based on individual variables from the 
Noble Index, and a Poisson regression model based on individual variables from the Noble 
Index. In each monthly model, the response variable is monthly accessions by station. 
Each section includes a description of the initial model development, variable selection, 
and model results. 


4.1 Three Year Prediction Models 

4.1.1 Initial Model Developments 

Our initial effort built a univariate regression model, called the Noble Index value (NIV) 
model. It utilized only the calculated Noble Index value for each station to generate an 
estimate of the expected number of accessions in a three-year period. The response variable 
is the total number of accessions over the past three years per NRS and the independent 
variable is the Noble Index value for each station. 

In order to provide a comparison to the NIV model, a three-year multiple regression 
model was also built. The response variable, the total number of accessions over the past 
three years, is the same as in the NIV model. Initially, all variables within the three-year 
model data set except for the Noble Index value and the RSID variable were considered for 
inclusion in the model. For the complete description of the variables please see Table 3.1. 

4.1.2 Variable Selection of Initial Models 

Three potential predictor variables are removed from the three-year multiple regression 
model because they are built from the same accession data that make up the response 
variable. Those variables are: 

• ASAD_3YrAvg: Three year average number of quality accessions for all services 
covering FY11-FY13. 

• PerFemaleAccessions_3YrAvg: Percentage of quality accession that are female for 
the past three fiscal years. 

• ESTvAccessions: Ratio of the number of enlisted screening tests given to potential 
accessions and the total number of accessions for that NRS. The EST is a test given 
to potential accessions prior to taking the AFQT that estimates whether a potential 
accession will pass. A recruit who passes the EST is then given the chance to take 
the AFQT. 

Additional variable selection is required in order to determine the significant factors that 
affect long-term predictions. As described in Chapter 3, the technique utilized is an analysis 
of Akaike’s Information Criterion (AIC) value. This value is a statistic that measures 
how well the model fits the data set while penalizing the model for having too many variables. 
The model with the lowest AIC is typically the best model to utilize since it balances having 
the simplest model with the best fit to the data [11]. To determine the model with the 
minimum AIC value, the dropterm() function in R is utilized. This function evaluates how 
removing each predictor variable one at a time affects the overall AIC value of the model. 
Table 4.1 displays the results of variable selection. 

Table 4.1: Three-Year Multiple Regression Variable Selection Results 

| Variables | Df | Sum of Sq | RSS | AIC | F Value | Pr(F) |
|---|---|---|---|---|---|---|
| PerDiversity | 1 | 46.415 | 1122852.245 | 6844.181 | 0.039 | 0.843 |
| QMAQualityFemaleRatio | 1 | 189.983 | 1122995.813 | 6844.304 | 0.160 | 0.689 |
| QMAvPopulation | 1 | 342.847 | 1123148.677 | 6844.435 | 0.289 | 0.591 |
| PerTop22Female | 1 | 356.627 | 1123162.457 | 6844.447 | 0.301 | 0.583 |
| PerBot22 | 1 | 608.715 | 1123414.545 | 6844.664 | 0.514 | 0.474 |
| X75thPercentile | 1 | 692.466 | 1123498.296 | 6844.736 | 0.585 | 0.445 |
| PopulationDensity | 1 | 822.229 | 1123628.059 | 6844.847 | 0.694 | 0.405 |
| PerTop22 | 1 | 909.400 | 1123715.230 | 6844.922 | 0.768 | 0.381 |
| QMAQualityRatio | 1 | 1042.632 | 1123848.462 | 6845.036 | 0.880 | 0.348 |
| QualityStationIndex | 1 | 1168.478 | 1123974.307 | 6845.144 | 0.987 | 0.321 |
| QualityStationIndexFemale | 1 | 1823.056 | 1124628.886 | 6845.706 | 1.539 | 0.215 |
| PerHSDG | 1 | 6537.366 | 1129343.196 | 6849.743 | 5.520 | 0.019* |
| DistanceToMEPS | 1 | 12190.162 | 1134995.992 | 6854.561 | 10.292 | 0.001** |
| LeadsPerCapita | 1 | 50015.542 | 1172821.371 | 6886.197 | 42.229 | 1.31E-10*** |
| EMR | 1 | 194688.673 | 1317494.502 | 6998.445 | 164.378 | 8.12E-35*** |
| Navy | 1 | 609740.067 | 1732545.897 | 7262.720 | 514.812 | 2.22E-91*** |
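
As a rough check, the AIC column of Table 4.1 can be reproduced from the RSS column with the formula from Section 3.2.3. The sketch below rests on two inferences that are not stated in the text: n = 965 observations (the 963 residual degrees of freedom of the two-parameter NIV model in Table 4.2 plus its two parameters) and p = 16 parameters in each reduced model (an intercept plus the 15 predictors that remain after one of the 16 is dropped).

```python
import math


def aic(rss, n_obs, n_params):
    """AIC = n log(RSS / n) + 2p, using the natural logarithm."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


N_OBS = 965     # inferred from residual degrees of freedom, not stated in the text
N_PARAMS = 16   # inferred: intercept plus 15 remaining predictors

# RSS values copied from three rows of Table 4.1.
rss_after_drop = {
    "PerDiversity": 1122852.245,
    "PerHSDG": 1129343.196,
    "Navy": 1732545.897,
}

aic_after_drop = {name: aic(rss, N_OBS, N_PARAMS) for name, rss in rss_after_drop.items()}
for name, value in aic_after_drop.items():
    print(name, round(value, 3))
```

Under these assumptions the computed values agree with the tabulated AIC values to within rounding, which supports reading each row as the model that results from dropping that one variable.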

The five variables significant in predicting three-year accessions, in order of importance, 
are: 

• Navy: The average number of recruiters over the past 12 months at each NRS 

• EMR: Competition Index indicating how competitive the recruiting market is for 
each specific NRS 

• LeadsPerCapita: Four-year average of National Leads divided by the population of 
the area 

• DistanceToMEPS: Distance between the NRS and the nearest MEPS 

• PerHSDG: Percentage of individuals within the area who have a minimum education 
level of a high school diploma 

4.1.3 Results of Initial Models 

This section describes the results of the NIV model and then the results of the three-year 
multiple linear regression model. The section that follows compares the two models 
to determine which is the better model to utilize for long-term predictions. To see the 
model validation results, please reference Appendix B and Appendix C. 

Noble Index Value Three-Year Model Results 

Table 4.2 contains the summary statistics of the NIV Model. This model accounts for 22% 
of the variance within the response, with an adjusted $R^2$ of 0.22. The residual standard error 
is 45.11; the mean of the response variable is 122 accessions. The residual standard error 
should be lower than the mean of the response variable. For each increase of one unit in 
the Noble Index value assigned to each NRS, the model predicts an increase of 1.369 accessions over 
the next three years. 

The NIV model fails to meet two assumptions. The model residuals show signs of 
heteroscedasticity (non-constant variance), and they are not normally distributed. This means 
that the model can only be used for point estimation, not for inference such as confidence 
intervals or hypothesis testing [13]. To see the full model validation, please see Appendix 
B. The next section contains the results of the three-year multiple linear regression model. 
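
The fitted NIV equation can be written out as a short sketch, using the coefficient estimates reported in Table 4.2. The function names are hypothetical, and flooring negative predictions at zero is an illustrative choice based on the remark in Chapter 3 that a negative prediction would correspond to zero accessions; it is not described as part of the fitted model.

```python
# Coefficient estimates from Table 4.2.
NIV_INTERCEPT = -18.832
NIV_SLOPE = 1.369


def predict_niv(noble_index_value):
    """Point estimate of three-year accessions from the Noble Index value."""
    return NIV_INTERCEPT + NIV_SLOPE * noble_index_value


def predict_niv_floored(noble_index_value):
    """Same estimate, floored at zero (an illustrative assumption)."""
    return max(0.0, predict_niv(noble_index_value))


print(round(predict_niv(100), 3))   # a station with an index value of 100
print(round(predict_niv(10), 3))    # a very low index value gives a negative estimate
print(predict_niv_floored(10))
```

Because the intercept is negative, any index value below roughly 13.8 produces a negative point estimate, which illustrates the limitation of the linear model noted in Chapter 3.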

Table 4.2: Three-Year NIV Model Summary Statistics 

| Variable | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| TotACCSum (intercept) | -18.832 | 8.561 | -2.200 | 0.028 |
| NobleIndexV21 | 1.369 | 0.082 | 16.715 | 0.000 |

| Statistic | Value |
|---|---|
| Residual Standard Error | 45.11 on 963 df |
| Multiple R-Squared | 0.224 |
| Adjusted R-Squared | 0.224 |
| P-Value | <2.2e-16 |


Three-Year Linear Multiple Regression Model Results 

The three-year multiple regression model summary statistics are listed in Table 4.3. The 
final model has five variables that are significant in predicting the total number of three-year 
accessions by station. The adjusted $R^2$ for this model is 0.55, meaning that the model 
is able to account for about 55% of the variance in the response. The residual standard error 
is 34 and the mean of the response variable is 122. 


Table 4.3: Three-Year Multiple Regression Summary Statistics 

| Variables | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| TotACCSum (intercept) | 47.078 | 17.138 | 2.747 | 0.006 |
| Navy | 20.499 | 0.874 | 23.443 | 1.94E-96 |
| EMR | -47.497 | 3.397 | -13.981 | 1.46E-40 |
| LeadsPerCapita | 2110.280 | 215.015 | 9.815 | 9.87E-22 |
| DistanceToMEPS | -0.100 | 0.022 | -4.530 | 6.64E-06 |
| PerHSDG | 74.424 | 21.445 | 3.470 | 0.001 |

| Statistic | Value |
|---|---|
| Residual Standard Error | 34.4 on 959 df |
| Multiple R-Squared | 0.5512 |
| Adjusted R-Squared | 0.5488 |
| P-Value | <2.2e-16 |
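
The adjusted R-squared in Table 4.3 can be approximately recovered from the multiple R-squared with the standard adjustment formula. The sketch assumes n = 965 observations, which is inferred from the 959 residual degrees of freedom plus the six estimated coefficients and is not stated directly in the text.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


N_OBS = 965  # inferred from the residual degrees of freedom, not stated in the text

adj_multiple = adjusted_r_squared(0.5512, N_OBS, n_predictors=5)  # Table 4.3
adj_niv = adjusted_r_squared(0.224, N_OBS, n_predictors=1)        # Table 4.2
print(round(adj_multiple, 3), round(adj_niv, 3))
```

Because the tabulated R-squared values are themselves rounded, the recomputed figures match the tables only to about three decimal places. The small gap between the raw and adjusted values reflects how few parameters each model has relative to the number of stations.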


The predictor variable coefficients within the three-year multiple regression model indicate 
how each variable affects the predicted number of accessions. The following is an explanation of how 
each coefficient affects the response variable, as predicted by the model. 

• Increasing the average number of Navy recruiters (Navy) by one over the next three 
years increases total three-year accessions by 20. 

• The LeadsPerCapita coefficient is misleading at face value because the mean of the variable is 0.009, so increasing 
it by one is not realistic. As a scaled example, increasing the four-year average 
number of national leads divided by the population of the area by 0.009 over the next 
three years results in an increase of 18.99 accessions. 

• PerHSDG is the percentage of individuals with a minimum education level of a high 
school diploma. Increasing this variable by one results in an increase of 74 accessions 
over the next three years. 

Intuitively, all of these variables should increase accessions, since more recruiters are able 
to recruit more people, a higher interest in joining the Navy should result in more recruits, 
and a population with a higher level of education should produce more eligible potential 
enlistees. 

The two predictor variables with negative coefficients are: 

• Increasing the competition index (“EMR”) by one decreases the number of accessions 
by 47 over the next three years. This may relate to difficulty in recruiting in a 
saturated market. 

• Increasing the distance between an NRS and MEPS by one decreases the number 
of accessions by 0.1 over the next three years. Recruiters have to drive potential 
accessions from the station to MEPS; this may reduce the amount of time available 
to recruit. 

These estimates are plausible because one would expect both more time spent in a car 
and recruiting in a more competitive area to reduce accession totals. 
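
The coefficient interpretations above can be collected into a small prediction sketch using the estimates from Table 4.3. The station values below are hypothetical, including the scales assumed for EMR and PerHSDG, and the function names are illustrative only.

```python
# Coefficient estimates from Table 4.3.
COEF = {
    "Intercept": 47.078,
    "Navy": 20.499,
    "EMR": -47.497,
    "LeadsPerCapita": 2110.280,
    "DistanceToMEPS": -0.100,
    "PerHSDG": 74.424,
}


def predict_three_year(station):
    """Point estimate of three-year accessions for one station."""
    return COEF["Intercept"] + sum(COEF[name] * value for name, value in station.items())


def marginal_effect(name, delta):
    """Change in predicted accessions when one predictor changes by delta."""
    return COEF[name] * delta


# Hypothetical station; the scales of EMR and PerHSDG are assumptions.
station = {
    "Navy": 3,
    "EMR": 1.0,
    "LeadsPerCapita": 0.009,
    "DistanceToMEPS": 50,
    "PerHSDG": 0.85,
}

prediction = predict_three_year(station)
leads_effect = marginal_effect("LeadsPerCapita", 0.009)
print(round(prediction, 2), round(leads_effect, 2))
```

The scaled LeadsPerCapita effect reproduces the 18.99 accessions quoted in the text, and adding one recruiter changes the estimate by the Navy coefficient of about 20.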

The three-year multiple regression model shows signs of heteroscedasticity (non-constant 
variance) in the residuals, and they are not normally distributed, failing the same two 
assumptions as the NIV model. This model can still be utilized for point estimation, but 
cannot be utilized for inference such as confidence intervals or hypothesis testing. To see the 
full model validation, please refer to Appendix C. Overall, the model is significant and the 
factors within this model should be further explored by recruiting leaders. 

4.1.4 Comparison of Three-Year Models 

The three-year multiple regression model has a much higher adjusted $R^2$ than the NIV 
model, which means that it has better explanatory capability than the NIV model. The 
three-year multiple regression model also has a lower residual standard error than the NIV model, 
which results in more accurate predictions. Additionally, the three-year multiple regression 
model does not utilize prior accession data to generate predictions. The NIV model does 
include these variables. By removing these variables, the model is able to withstand use 
in future years, since the policies, personnel, and procedures set forth in recruiting may 
change in the future. Lastly, the three-year multiple regression model is a simpler model 
than the actual Noble Index value. The NIV model is a univariate model, but in order to 
get that specific value, 18 variables are utilized to calculate it. It is complex and difficult 
to compute, while the three-year model uses only five factors from the Noble Index data 
set. The best model to utilize for three-year accession production is the three-year multiple 
regression model. Table 4.4 is a summary of both models. 


Table 4.4: Summary of Three-Year Models 

| NIV 3 Yr Model | 3 Yr Linear Multiple Regression Model |
|---|---|
| Adj. $R^2$: .21 | Best 3 Yr accession model |
| Does not account for number of recruiters | Adj. $R^2$: .55 |
| Fails to meet model assumptions | Fails to meet model assumptions |
| Ability to provide negative predictions | Ability to provide negative predictions |


For NRD leaders to understand their districts, they should investigate the significant variables 
within the three-year multiple regression model. The average number of recruiters 
at each station is the most important variable for long-term production. The second most 
important factor is the competition index. NRD leaders should analyze each station’s competition 
index and ensure they are not investing resources in areas that may not result in 
the best outcome. The third most important variable is the four-year average number of 
national leads divided by the population of an area. NRD leaders should pay attention to 
which areas have the highest LeadsPerCapita value and recruit in those areas, since young 
people in these areas may have a higher interest in joining the Navy. The fourth most important 
factor is the distance between a station and MEPS. This may relate to commute 
times, which take away from time spent recruiting. The last variable of significance is the 
percentage of individuals that have a minimum education level of a high school diploma. 
NRD leaders should focus on areas with a high PerHSDG value since it may be easier to 
find qualified applicants in these places. Overall, the three-year multiple regression model 
is a simpler and better model to utilize than the NIV model. 


4.2 Navy Monthly Recruiting Models 

In this section, three monthly models are analyzed: the NIV monthly model, the multiple 
linear regression monthly model, and the Poisson monthly regression model. The response 
variable for all models is the number of monthly accessions by station. The purpose of 
this section is to determine which model is best for predicting monthly accessions and 
to identify key factors that influence monthly accessions. Each subsection highlights initial
model development, variable selection, and model results. The Poisson model validation
is included in this section, and model validation for the two linear regression models is
located in Appendices D and E.

4.2.1 Poisson Model Development 

A Poisson regression model is used, since the response variable within the monthly model 
data set might reasonably be modeled with the Poisson distribution; the occurrences of 
monthly accessions are small numbers and always non-negative integers. The response 
variable in the Poisson Monthly Model is the number of monthly accessions by station. 
Initially, all variables in the monthly model data set were considered for inclusion in the 
model as predictor variables except: 

• ASAD_3YrAvg: This is the three-year average number of quality accessions for all
services covering FY11-FY13.

• PerFemaleAccessions_3YrAvg: This is the percentage of quality accessions that are
female for the past three fiscal years.

• ESTvAccessions: This is the ratio of the number of enlisted screening tests given to
potential accessions and the total number of accessions for a specific NRS.

• RSID: This is a classification variable, the specific recruiting station
identification number.

• Noble Index Value: This is the single value calculated from the Noble Index for each
NRS.


The reasoning for removing these variables prior to variable selection is the same as that
applied previously to the three-year multiple regression model. In addition, the Noble Index
value was removed since it is utilized to form another model later in this chapter. In
order to determine the best Poisson model, further variable selection was required.

4.2.2 Variable Selection for Poisson Monthly Model 

An analysis of Akaike’s Information Criterion (AIC) value was conducted for variable 
selection. Additionally, the correlation values between predictor variables are analyzed to 
prevent collinearity. Collinearity exists when two variables utilize similar data to explain 
the relationship between the response and predictor variables [11]. Collinearity affects the 
estimates of the coefficients, so predictor variables that are highly correlated are removed 
from the model. The results of variable selection are in Table 4.5. 
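
The collinearity screen described above can be sketched as a pairwise correlation check. This is only an illustration: the station values and the 0.7 cutoff are hypothetical, and the function names are not taken from the thesis.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlated_pairs(predictors, threshold):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(predictors), 2):
        r = pearson(predictors[a], predictors[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged

# Hypothetical values for three predictors at five stations (not thesis data).
predictors = {
    "PerTop22": [0.10, 0.20, 0.30, 0.40, 0.50],
    "PerTop22Female": [0.11, 0.19, 0.32, 0.41, 0.49],
    "DistanceToMEPS": [40.0, 5.0, 60.0, 10.0, 35.0],
}
flagged = correlated_pairs(predictors, threshold=0.7)  # cutoff is an assumption
for a, b, r in flagged:
    print(f"{a} ~ {b}: r = {r:.2f}")
```

With real data, each flagged pair would then be reviewed to decide which of its variables to drop from the candidate set.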


Table 4.5: Monthly Poisson Model Variable Selection Results

Variable                     Df    Deviance     AIC           LRT        Pr(Chi)
PerDiversity                  1    50003.514    145943.300       0.766   0.381
QMAQualityFemaleRatio         1    50004.057    145943.843       1.309   0.253
PerBot22                      1    50005.406    145945.192       2.658   0.103
PopulationDensity             1    50009.159    145948.945       6.411   0.011
PerTop22Female                1    50011.712    145951.498       8.964   0.003
QMAQualityRatio               1    50013.984    145953.770      11.236   0.001
PerTop22                      1    50016.672    145956.458      13.924   0.000
QualityStationIndex           1    50016.881    145956.666      14.133   0.000
QMAvPopulation                1    50017.316    145957.102      14.568   0.000
QualityStationIndexFemale     1    50029.955    145969.741      27.207   0.000
X75thPercentile               1    50049.394    145989.180      46.646   0.000
PerHSDG                       1    50085.258    146025.043      82.510   0.000
DistanceToMEPS                1    50105.922    146045.708     103.175   0.000
LeadsPerCapita                1    50227.384    146167.169     224.636   0.000
EMR                           1    51792.020    147731.806    1789.272   0.000
Mth                          11    53071.281    148991.067    3068.534   0.000
Navy                          1    54978.240    150918.026    4975.492   0.000


The variables in Table 4.5 are sorted from least important to most important. We removed
PerDiversity, QMAQualityFemaleRatio, and PerBot22 since they are insignificant. Further
inspection of the variables showed that several are highly correlated and need to be
removed from the model due to collinearity. These variables are PerTop22,
PerTop22Female, QualityStationIndex, QualityStationIndexFemale, QMAvPopulation, and
QMAQualityRatio. Refer to Figure F.1 for the Poisson model correlation matrix. All of
these variables are important to recruiting, but within the model they are not useful
since they are either not statistically significant or correlated with other predictor variables.
A definition of each variable and an explanation of why it is removed follows.

• The percentage of individuals within an area that are not Caucasian (PerDiversity) is
removed due to insignificance within the model.

• The ratio of the estimated number of quality female QMA to the total number of
QMA in an area (QMAQualityFemaleRatio) is removed due to insignificance within
the model.

• The percentage of the bottom 22 market segments from the NMSM (PerBot22) is
removed due to insignificance within the model.

• The percentage of females in the Top 22 market segments of the NMSM
(PerTop22Female) is removed due to a correlation value of .98 with the PerTop22
variable.

• The percentage of individuals in the Top 22 market segments of the NMSM
(PerTop22) is removed due to a correlation value of .98 with the PerTop22Female
variable.

• The quality station index value from the NMSM (QualityStationIndex) is removed
due to a correlation value of .73 with the QualityStationIndexFemale variable.

• The quality station index female value from the NMSM (QualityStationIndexFemale)
is removed due to a correlation value of .73 with the QualityStationIndex
variable.

• The percentage of the population that is considered QMA (QMAvPopulation) is
removed due to a correlation value of .43 with the QMAQualityRatio variable.

• The percentage of QMA who are estimated to be in the AFQT TSC I-IIIA
(QMAQualityRatio) is removed due to a correlation value of .43 with the
QMAvPopulation variable.
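
As a rough check on the significance screen, the Pr(Chi) column of Table 4.5 can be recomputed from the LRT column. The sketch below assumes that Pr(Chi) is the upper-tail probability of a chi-square distribution with Df degrees of freedom. It covers only the Df = 1 rows, where that tail has a closed form in the complementary error function. The function name is illustrative.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # If Z is standard normal, P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

# LRT statistics copied from the first rows of Table 4.5 (all have Df = 1).
lrt = {
    "PerDiversity": 0.766,
    "QMAQualityFemaleRatio": 1.309,
    "PerBot22": 2.658,
    "PopulationDensity": 6.411,
}
p_values = {name: chi2_sf_1df(stat) for name, stat in lrt.items()}
for name, p in p_values.items():
    print(f"{name:<24s} LRT = {lrt[name]:6.3f}  p = {p:.3f}")
```

The first three p-values exceed 0.05, which matches the removal of those three variables for insignificance.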





4.2.3 Results and Validation of Monthly Poisson Model 

Table 4.6 shows the summary statistics for the Poisson model. The Mth variable, a
categorical variable for the month in which the observation was taken, is very
significant. Mth01 is not included in the summary statistics because it is the baseline
level for the rest of the Mth levels. Each coefficient of the Mth variable gives the
difference, on the log scale, between monthly accessions for that specific month and the
month of January (Mth01). For instance, Mth02 has an estimate of 0.056; this means that
for February the model predicts an average increase by a factor of exp(0.056) = 1.06, or
6%, compared to January with the same attributes.


Table 4.6: Monthly Poisson Model Summary Statistics

Variables                  Estimate    Std. Error    z value     Pr(>|z|)
NumRecruits (Intercept)     0.6423     0.0536         11.9900    0.0000
Navy                        0.1602     0.0022         73.1565    0.0000
PopulationDensity           0.00002    0.0000          4.2411    0.0000
PerHSDG                     0.5990     0.0648          9.2440    0.0000
LeadsPerCapita             12.2753     0.5200         23.6077    0.0000
DistanceToMEPS             -0.0007     0.0001        -10.5499    0.0000
X75thPercentile            -0.0023     0.0003         -8.5644    0.0000
EMR                        -0.4494     0.0104        -43.3792    0.0000
Mth02                       0.0562     0.0147          3.8174    0.0001
Mth03                       0.1430     0.0144          9.9217    0.0000
Mth04                       0.0667     0.0147          4.5405    0.0000
Mth05                       0.0584     0.0147          3.9696    0.0001
Mth06                       0.3095     0.0139         22.2775    0.0000
Mth07                       0.3018     0.0139         21.6808    0.0000
Mth08                       0.2911     0.0139         20.8658    0.0000
Mth09                       0.1231     0.0145          8.4959    0.0000
Mth10                      -0.1896     0.0157        -12.0898    0.0000
Mth11                      -0.0664     0.0152         -4.3756    0.0000
Mth12                      -0.1703     0.0156        -10.9146    0.0000

Statistic                  Value
Null Deviance              64656 on 34739 df
Residual Deviance          50108 on 34721 df
AIC                        146031
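
To illustrate how the estimates in Table 4.6 turn into a prediction, the sketch below exponentiates the linear predictor for a single station. It assumes the usual log link for Poisson regression. The station's attribute values are hypothetical and chosen only for illustration, and the function and variable names are not taken from the thesis.

```python
import math

# Coefficient estimates copied from Table 4.6 (log scale).
COEF = {
    "Intercept": 0.6423,
    "Navy": 0.1602,
    "PopulationDensity": 0.00002,
    "PerHSDG": 0.5990,
    "LeadsPerCapita": 12.2753,
    "DistanceToMEPS": -0.0007,
    "X75thPercentile": -0.0023,
    "EMR": -0.4494,
}
# Month effects relative to the January baseline (Mth01 = 0).
MONTH = {1: 0.0, 2: 0.0562, 3: 0.1430, 4: 0.0667, 5: 0.0584, 6: 0.3095,
         7: 0.3018, 8: 0.2911, 9: 0.1231, 10: -0.1896, 11: -0.0664, 12: -0.1703}

def predict_monthly_accessions(station, month):
    """Expected monthly accessions: exp of the linear predictor (log link)."""
    eta = COEF["Intercept"] + MONTH[month]
    eta += sum(COEF[name] * value for name, value in station.items())
    return math.exp(eta)

# Hypothetical station; these attribute values are illustrative only.
station = {"Navy": 3, "PopulationDensity": 100, "PerHSDG": 0.85,
           "LeadsPerCapita": 0.009, "DistanceToMEPS": 50,
           "X75thPercentile": 20, "EMR": 1.0}

january = predict_monthly_accessions(station, 1)
june = predict_monthly_accessions(station, 6)
print(f"January: {january:.2f}, June: {june:.2f}")
```

Because the prediction is an exponential, it is positive for any combination of inputs, which is the property that rules out negative predicted accessions.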
The parameter estimates that are positive in the summary statistics all seem plausible.
Having more recruiters, a higher education level within an area, more people to initially
recruit from, more leads within a given area, and months close to high school graduation
all lead to an increase in monthly accessions under the model. The variables with negative
parameter estimates are interesting. The variables DistanceToMEPS, competition index
("EMR"), and X75thPercentile are all, reasonably enough, associated with decreases in
monthly accessions under the model. Below is a summary of how each variable affects the
response variable.

• Increasing the average number of Navy recruiters by one increases monthly
accessions by a factor of exp(0.16) = 1.17, an increase in accessions of 17%.

• Population density is the number of 17-24 year olds per square mile within
the NRS's AOR (PopulationDensity). Increasing the population density by one
increases monthly accessions by a factor of exp(0.00002) = 1.00002, an increase in
accessions of .002% under the model.

• Increasing the percentage of individuals with a minimum education level of a high
school diploma (PerHSDG) by one increases monthly accessions by a factor of
exp(0.599) = 1.82, an increase of 82%.

• Leads Per Capita is the four-year average number of national leads divided by the
population of the area. Increasing this variable by one is not realistic since the ratio
has a mean value of 0.009 and median value of 0.007. Increasing this variable
by its mean of 0.009 equates to a factor of exp(0.1105) = 1.12, a 12% increase.

• The months (Mth) of February through September have higher predicted monthly
accessions than January. The months from October through December have lower
predicted monthly accessions than the month of January.

• Increasing the distance of the NRS to MEPS (DistanceToMEPS) by one hundred
miles decreases monthly accessions by a factor of exp(-0.07) = .93, a 7% reduction.
More time spent in the car by recruiters leads to less time recruiting.

• Increasing the maximum distance 75% of the recruits have to drive to get to the
NRS (X75thPercentile) by fifty miles decreases monthly accessions by 11% under
the model. This may be because more effort is required by the recruits to get to the
NRS or because the NRS is not located in a population center.

• Increasing the competition index value ("EMR") by one decreases monthly
accessions by a factor of exp(-0.4494) = .64, or 36%. It is harder to obtain recruits
in a more competitive area.
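
The percentage effects in the list above all come from exponentiating a coefficient times the change in its predictor. A minimal sketch of that arithmetic follows, using estimates from Table 4.6. The helper name and the dictionary labels are illustrative.

```python
import math

def multiplicative_effect(coefficient, change=1.0):
    """Factor applied to the expected count when a predictor rises by `change`."""
    return math.exp(coefficient * change)

# (coefficient from Table 4.6, change in the predictor discussed in the text)
cases = {
    "Navy (+1 recruiter)": (0.1602, 1),
    "LeadsPerCapita (+0.009)": (12.2753, 0.009),
    "DistanceToMEPS (+100 miles)": (-0.0007, 100),
    "X75thPercentile (+50 miles)": (-0.0023, 50),
    "EMR (+1)": (-0.4494, 1),
    "Mth02 vs. January": (0.0562, 1),
}
effects = {label: multiplicative_effect(b, d) for label, (b, d) in cases.items()}
for label, factor in effects.items():
    print(f"{label:<30s} factor = {factor:.3f}  ({(factor - 1) * 100:+.1f}%)")
```

Effects from several predictors combine by multiplication, not addition, since they add on the log scale.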


It makes sense that increased driving times for both recruiters and recruits and a more 
competitive market have a negative effect on the number of monthly accessions. October 
through December having a lower estimated number of accessions than January also seems 
intuitive since high school seniors have applied to college by this time and are waiting for 
acceptance letters. 


The rest of this section describes the validation of the monthly Poisson model. The first
check is whether the model is properly dispersed. An estimated dispersion
parameter (DP) of 1 is optimal for a Poisson model. In this model, the estimate of the DP is
1.35. This indicates that the model is slightly over-dispersed. This is something that should
be further investigated, but it does not invalidate the model.
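
The text does not state how the dispersion parameter was estimated. One common estimator, assumed here, divides the Pearson chi-square statistic by the residual degrees of freedom. The sketch below applies it to a handful of hypothetical observed counts and fitted means, not to the thesis data.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square statistic divided by residual degrees of freedom."""
    chi_sq = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_sq / (len(observed) - n_params)

# Hypothetical counts and fitted means for six station-months (not thesis data).
observed = [2, 5, 0, 4, 3, 7]
fitted = [3.0, 4.0, 1.0, 3.5, 3.0, 5.5]
dp = pearson_dispersion(observed, fitted, n_params=2)
print(f"Estimated dispersion parameter: {dp:.2f}")
```

Values near 1 are consistent with the Poisson assumption that the variance equals the mean. Values well above 1 suggest over-dispersion.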


The proportion of deviance explained is 0.225, or 22.5%. This quantity can be thought
of as analogous to the adjusted R^2 for a linear regression model. The formula to
compute the proportion of deviance explained is [12]:

\[
1 - \frac{\text{Residual Deviance}}{\text{Null Deviance}}
\]
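
Applying this formula to the deviances reported in Table 4.6 reproduces the stated value. This is a short arithmetic check, and the function name is illustrative.

```python
def proportion_deviance_explained(residual_deviance, null_deviance):
    """One minus the ratio of residual deviance to null deviance."""
    return 1.0 - residual_deviance / null_deviance

# Deviances from Table 4.6.
explained = proportion_deviance_explained(50108, 64656)
print(f"Proportion of deviance explained: {explained:.3f}")
```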


In order to check the structural fit of the model, a generalized additive model was fit and
partial residual plots were generated for each predictor variable. Refer to Appendix F to see
each plot. An examination of the plots indicates the structure of the model is sound and
there is no need to transform any of the predictor variables.

As an aid to checking the variance structure of the model, Figure 4.1 shows the estimated
variance against the mean. The estimated variance is proportional to the mean: as the
mean increases so does the variance, as indicated by the blue line. Since there are 2,886
zeros in 35,244 observations, the plot is unusual, but the overall trend along the blue line is
still valid.

Figure 4.1: Poisson Monthly Model Estimated Variance v. Mean Plot

The Cook's distance plot checks for highly influential observations. An observation with a
Cook's distance above .5 is considered an influential outlier. None of the observations are
influential, which makes sense since there are close to 35,000 observations; see Figure 4.2.

The model is structurally sound, there are no influential outliers, it is slightly
over-dispersed, and the variance structure is sound. This model meets all of its assumptions.
The downside is that the proportion of deviance explained is low; a higher proportion
of deviance explained would increase the model's explanatory capability. This model should be
considered for use by NRC since it is a valid, empirically based model. The NIV and the
multiple linear regression monthly models are now explored in order to see if they are better
to utilize for short-term predictions.


4.2.4 Noble Index Value Monthly Model 

In order to evaluate the usefulness of the Noble Index in predicting monthly production, we
build a monthly regression model using the calculated Noble Index value as the independent
variable and monthly accessions by station as the dependent variable. Variable selection is
not necessary since this is a univariate regression model. The next section is an analysis of
its results.

Figure 4.2: Poisson Monthly Model Cook's Distance Plot

4.2.5 Noble Index Value Monthly Results

Table 4.7 shows the summary statistics for the NIV monthly model. The Noble Index variable
is significant in predicting monthly accessions, with a p-value of essentially zero and an
adjusted R^2 of 0.07. The residual standard error is 2.4, which means that when using the
least squares estimation line for predictions there is an average error of about 2.4 accessions.
The mean of the response variable is 3.39. Increasing the Noble Index value by one
increases predicted monthly accessions by an average of 0.04 per month. This model's
residuals show signs of heteroscedasticity (non-constant variance) and are not normally
distributed. The model does not do as well as the monthly Poisson model based on its
explanatory capability, and it fails to meet two model assumptions. Additionally, this model
allows for negative predictions, which the Poisson model does not. For the results of the
model validation see Appendix D. Next is an analysis of the monthly multiple regression
model.
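
The possibility of negative predictions follows directly from the fitted line. The sketch below uses the intercept and slope reported in Table 4.7 to find the Noble Index value below which the prediction turns negative. The two index values evaluated are hypothetical, and the text does not say whether any station's index actually falls below the break-even point.

```python
INTERCEPT = -0.523  # Table 4.7, intercept estimate
SLOPE = 0.038       # Table 4.7, NobleIndex_v21 estimate

def niv_prediction(noble_index_value):
    """Predicted monthly accessions from the univariate linear model."""
    return INTERCEPT + SLOPE * noble_index_value

# Index value at which the fitted line crosses zero.
break_even = -INTERCEPT / SLOPE
print(f"Prediction is negative below an index value of {break_even:.2f}")

# Hypothetical index values, for illustration only.
for value in (10, 100):
    print(f"Noble Index {value}: predicted accessions = {niv_prediction(value):.3f}")
```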
Table 4.7: NIV Monthly Summary Statistics

Variable                   Estimate    Std. Error    t value    Pr(>|t|)
NumRecruits (Intercept)    -0.523      0.075         -6.947     0.000
NobleIndex_v21              0.038      0.001         52.789     0.000

Statistic                  Value
Residual Standard Error    2.381 on 34738 df
Multiple R-Squared         0.074
Adjusted R-Squared         0.074
P-Value                    <2.2e-16





4.2.6 Monthly Linear Multiple Regression Model Development 

Since the three-year multiple regression model did better than the three-year NIV model, a
monthly multiple regression model is built to determine if it is more capable. The response
variable is the same as in the previous two models: the number of accessions per month by
station. Initially, all variables within the monthly model data set were considered for
inclusion in the model except:

• ASAD_3YrAvg: This is the three-year average number of quality accessions for all
services covering FY11-FY13.

• PerFemaleAccessions_3YrAvg: This is the percentage of quality accessions that are
female for the past three fiscal years.

• ESTvAccessions: This is the ratio of the number of enlisted screening tests given to
potential accessions and the total number of accessions for a specific NRS.

• RSID: This is a classification variable, the specific recruiting station
identification number.

• Noble Index Value: This is the single value calculated from the Noble Index for each
NRS.

The same reasoning applied to the three-year regression model is applied to this model
with regard to removing these variables prior to variable selection. The next section further
explains variable selection for the model.
4.2.7 Variable Selection for Monthly Linear Multiple Regression Model

Variable selection is done in the same manner as for the Poisson monthly model: an
analysis of the AIC value is conducted. Table 4.8 shows the results of variable
selection.
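
The AIC column of Table 4.8 appears consistent with the form n log(RSS/n) + 2k, which is the expression R's stepwise procedures use for linear models. The sketch below checks this against the PerDiversity row. The observation count is taken as the 34,720 residual degrees of freedom plus the 20 coefficients of the final model in Table 4.9. The parameter count of 27 (all candidate terms except PerDiversity) is an inference from the table, not a figure stated in the text.

```python
import math

def aic_from_rss(rss, n_obs, n_params):
    """AIC up to an additive constant: n * log(RSS / n) + 2 * k."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params

n_obs = 34740   # 34720 residual df + 20 estimated coefficients (Table 4.9)
n_params = 27   # assumed: intercept + 15 single-df terms + 11 month levels
aic = aic_from_rss(163218.970, n_obs, n_params)  # RSS from the PerDiversity row
print(f"AIC = {aic:.3f}")
```

Under this form, dropping a term lowers the penalty by two per parameter and raises the first term through the larger RSS, so rows with small AIC values are the cheapest to remove.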


Table 4.8: Variable Selection for Monthly Linear Multiple Regression Model

Variable                     Df    Sum of Sq    RSS           AIC          F Value     Pr(F)
PerDiversity                  1        1.289    163218.970    53803.760       0.274    0.601
QMAQualityFemaleRatio         1        5.277    163222.958    53804.608       1.122    0.289
QMAvPopulation                1        9.524    163227.204    53805.512       2.025    0.155
PerTop22Female                1        9.906    163227.587    53805.594       2.107    0.147
PerBot22                      1       16.909    163234.589    53807.084       3.596    0.058
X75thPercentile               1       19.235    163236.915    53807.579       4.091    0.043
PopulationDensity             1       22.840    163240.520    53808.346       4.857    0.028
PerTop22                      1       25.261    163242.941    53808.861       5.372    0.020
QMAQualityRatio               1       28.962    163246.642    53809.649       6.159    0.013
QualityStationIndex           1       32.458    163250.138    53810.393       6.903    0.009
QualityStationIndexFemale     1       50.640    163268.321    53814.262      10.770    0.001
PerHSDG                       1      181.594    163399.274    53842.115      38.620    0.000
DistanceToMEPS                1      338.616    163556.296    53875.483      72.014    0.000
LeadsPerCapita                1     1389.321    164607.001    54097.943     295.471    0.000
EMR                           1     5408.019    168625.699    54935.894    1150.140    0.000
Mth                          11    10424.371    173642.051    55934.281     201.544    0.000
Navy                          1    16937.224    180154.904    57233.445    3602.091    0.000

It is interesting to note that all of the factors deemed insignificant in predicting monthly
accessions are also insignificant in predicting three-year accessions within the three-year
multiple linear regression model. The following variables are removed since they are not
significant in predicting monthly accessions or have high correlation with other predictor
variables:

• The percentage of individuals within an area that are not Caucasian (PerDiversity) is
removed due to insignificance under the model.

• The ratio of the estimated number of quality female QMA to the total number of
QMA in the area (QMAQualityFemaleRatio) is removed due to insignificance under
the model.

• The percentage of the population that is considered QMA (QMAvPopulation) is
removed due to insignificance under the model.

• The percentage of the top 22 market segments for females from the NMSM
(PerTop22Female) is removed due to insignificance under the model.

• The percentage of the bottom 22 market segments from the NMSM (PerBot22) is
removed due to a correlation value of 0.89 with the PerTop22 variable.

• The percentage of individuals in the Top 22 market segments of the NMSM
(PerTop22) is removed due to a correlation value of 0.98 with the PerTop22Female
variable.

• The quality station index value from the NMSM (QualityStationIndex) is removed
due to a correlation value of 0.73 with the QualityStationIndexFemale variable.

• The quality station index female value from the NMSM (QualityStationIndexFemale)
is removed due to a correlation value of 0.73 with the QualityStationIndex
variable.

4.2.8 Results of Monthly Linear Multiple Regression Model 

The final model has nine variables. Table 4.9 provides the summary statistics for the model.
The adjusted R^2 is 0.23, which means the model is able to account for 23% of the variance
within the response. It has a p-value of <2.2e-16, which means that the model is statistically
significant, and a residual standard error of 2.169. The mean of the response variable is
3.3.
Table 4.9: Summary Statistics for Monthly Linear Multiple Regression Model

Variable                   Estimate    Std. Error    t value    Pr(>|t|)
NumRecruits (Intercept)     1.510      0.200           7.537    0.0000
Navy                        0.569      0.009          61.594    0.0000
PopulationDensity           0.00006    0.000           2.759    0.0058
PerHSDG                     1.878      0.244           7.707    0.0000
LeadsPerCapita             53.471      2.456          21.769    0.0000
QMAQualityRatio            -0.523      0.136          -3.854    0.0001
DistanceToMEPS             -0.002      0.000          -9.207    0.0000
X75thPercentile            -0.003      0.001          -3.073    0.0021
EMR                        -1.343      0.037         -36.006    0.0000
Mth02                       0.179      0.057           3.145    0.0017
Mth03                       0.477      0.057           8.367    0.0000
Mth04                       0.214      0.057           3.750    0.0002
Mth05                       0.187      0.057           3.272    0.0011
Mth06                       1.125      0.057          19.740    0.0000
Mth07                       1.093      0.057          19.164    0.0000
Mth08                       1.048      0.057          18.383    0.0000
Mth09                       0.406      0.057           7.125    0.0000
Mth10                      -0.536      0.057          -9.397    0.0000
Mth11                      -0.199      0.057          -3.496    0.0005
Mth12                      -0.486      0.057          -8.519    0.0000

Statistic                  Value
Residual Standard Error    2.169 on 34720 df
Multiple R-Squared         0.232
Adjusted R-Squared         0.231
P-Value                    <2.2e-16
The variables listed below all have positive estimates. An explanation of how
they affect the model is provided.

• Increasing the average number of Navy recruiters by one increases predicted average
monthly accessions by 0.569. This may relate to more recruiters generating higher
numbers of accessions.

• Increasing the population density of an area by one 17-24 year old person per square
mile slightly increases predicted average monthly accessions, by 0.00006. This may
be due to having a larger initial group of individuals to recruit from.

• Increasing the percentage of individuals with a minimum level of education of a high
school diploma by one increases monthly accessions by an average of 1.878 under the
model. A population that is better educated produces more potential applicants who are
qualified to join the Navy.

• Increasing the four-year average number of national leads divided by the population
of the area by one is unrealistic since the mean of this variable is 0.009 and the median
is 0.007. Increasing this variable by its mean of 0.009 equates to an increase of .48
monthly accessions.

• The months of February to September all have a higher number of monthly
accessions under the model than January. This may be due to the timing of high school
graduation and high school seniors deciding what their future will entail.

All of the variables with positive estimates seem to make sense intuitively. The variables
listed next all have negative estimates.

• Increasing the percentage of QMA who are estimated to be in the AFQT TSC I-IIIA
(QMAQualityRatio) by one decreases predicted average monthly accessions by
0.523. This may be due to a more highly educated population seeking higher education.

• Increasing the distance from the NRS to MEPS by one mile decreases predicted average
monthly accessions by 0.002. Longer driving times for recruiters mean less time
recruiting.

• Increasing the maximum distance 75% of the recruits have to drive to get to the NRS
by fifty miles decreases predicted average monthly accessions by 0.15. This may
equate to more effort required by the recruits to get to the NRS or a station that is not
located in a population center.

• Increasing the competition index value by one decreases predicted average monthly
accessions by 1.343. A more competitive market makes recruiting harder in that area.

• The months of October through December have fewer monthly accessions than
January.

The variables with negative estimates intuitively make sense since they deal with more
competitive recruiting areas, distances associated with getting to the MEPS, and distances
recruits travel to get to the recruiting station. The QMAQualityRatio variable is
interesting because increasing this value should indicate that a higher ratio of qualified
applicants will do well on the AFQT and in turn be able to access. Within the model, this
variable negatively affects accessions, which seems counter-intuitive. Overall, the model is
significant in predicting monthly accessions and has an explanatory power equivalent to the
Poisson monthly model.

This model fails to meet two assumptions: its residuals show signs of heteroscedasticity
(non-constant variance) and are not normally distributed. Because it fails to meet these
model assumptions, this model cannot be utilized for inference (confidence intervals or
hypothesis testing), but it can be utilized for point estimation [13]. To see the full model
validation, please reference Appendix E.

4.2.9 Comparison of Monthly Models 

All monthly models are significant in predicting monthly accessions. The NIV monthly
model and monthly linear multiple regression model fail the same two model assumptions
of constant variance and normal distribution. Additionally, the two models allow for
negative predictions, which in reality are not possible. The Poisson model meets all model
assumptions and is comparable with the linear multiple regression model for the best
explanatory capability.

Several factors are common to both the Poisson monthly model and the
monthly linear multiple regression model:


• Having a higher average number of Navy recruiters increases monthly accessions in
both models. This is expected since an increase in manpower resources equates to
more work able to be completed.

• Having a more educated population within the area increases monthly accessions in
both models. A more educated populace may increase monthly accessions by
reducing the time it takes to find qualified individuals.

• Having more national leads within a given area increases monthly accessions in both
models. Having a higher number of national leads may equate to more interest in
joining the Navy.

• February through September generally will produce more accessions than January.
October through December will generally produce fewer accessions than January.
This makes sense since it is around the time that high school seniors are getting ready
to decide what they want to do with their future.

• A greater distance from the NRS to MEPS decreases monthly accessions within both
models. This can be attributed to longer driving distances, which result in less time
actually recruiting.

• Having a greater maximum distance that 75% of the recruits have to travel to get to the
NRS negatively affects monthly accessions in both models. The reason for this may
be that having a recruiting station that is harder to get to requires more effort from the
recruits. This also may be attributed to having an NRS located outside of population
centers.

• A higher competition index negatively affects monthly accessions in both models. A
higher competition index equates to a more competitive recruiting environment.


The variables listed above should be included in any model selected since they are
significant in both the monthly Poisson model and the monthly multiple linear regression model.
Last, the best model for analysis of monthly production is the Poisson model since it meets
all model assumptions, has one of the best explanatory capabilities, and accounts for a
small number of occurrences in the response variable. Table 4.10 is a summary of all three
monthly models. The next chapter highlights recommendations for use of the models and
recommendations for future work.
Table 4.10: Summary of Monthly Predictive Models

Poisson Monthly Model
• Best model for use in Make Goal game
• Proportion of Dev explained: .225
• Meets all model assumptions
• Does not allow for negative predictions

NIV Monthly Model
• Adj. R^2: 0.07
• Does not account for number of recruiters
• Fails to meet model assumptions
• Has lowest explanatory power of all monthly models

Monthly Linear Multiple Regression Model
• Adj. R^2: 0.23
• Fails to meet model assumptions
• Ability to provide negative predictions
CHAPTER 5:
Summary and Conclusions 


Chapter five is separated into two sections. The first section is a brief explanation of the 
results of each model, as well as the final recommendation as to which models to utilize 
for short- and long-term analysis and predictions. The second section is future work that 
should be completed in order to enhance the capability of the models and suggestions for 
improvements to the Make Goal game. 

5.1 Recommendations 

The purpose of developing several models from the data provided by NRC within this thesis 
is to determine which is the most capable in generating predictions, the most appropriate 
to use for short-term and long-term analysis by NRC N5, and which assists in meeting the 
training objectives set forth by NRC N7. By determining the most capable model, NRC 
N5 is able to further investigate the most important factors that influence short-term and 
long-term accessions. By implementing one of these models into Make Goal, NRD leaders 
will gain experience in determining resource allocation strategies that best fit within their 
specific district and better understand the recruiting environment they work in. 

For the three-year models, both the NIV and the multiple linear regression models are 
significant in predicting the total number of accessions for three years by NRS. The NIV 
model has less explanatory power and a higher standard error than the multiple regression 
model. Both models fail to meet the assumptions of constant variance and normal
distribution but are still able to be utilized for point estimation. Overall, since the multiple
regression model has better explanatory power and lower residual standard error, it is the 
model that should be utilized for long-term analysis given the current data. 

For the monthly models, the NIV monthly model is the least capable with regard to
explanatory capability, with an adjusted R^2 of 0.07 and a standard error
of 2.381. Additionally, this model fails to meet the same two assumptions as the previous
models. The multiple linear regression model has a better explanatory capability compared
to the NIV monthly model, with an adjusted R^2 of 0.23 and a standard error of 2.169. This
model also fails to meet the same two assumptions. The Poisson model has equivalent
explanatory capability to the monthly multiple linear regression model, with the proportion of
deviance explained equal to 0.225. Additionally, this model meets all assumptions, so it can
be utilized for inference (confidence intervals and hypothesis testing) as well as point
estimation. For these reasons, the most appropriate monthly model to utilize for short-term
analysis and implementation into Make Goal is the Poisson monthly model. Table 5.1 is a
summary of the models and recommendations for use.


Table 5.1: Consolidated Summary of All Predictive Models

NIV 3 Yr Model
• Adj. R^2: 0.22
• Does not account for number of recruiters
• Fails to meet model assumptions
• Ability to provide negative predictions

3 Yr Linear Multiple Regression Model
• Best 3 Yr accession model
• Adj. R^2: 0.55
• Fails to meet model assumptions
• Ability to provide negative predictions

Poisson Monthly Model
• Best model for use in Make Goal game
• Proportion of deviance explained: 0.225
• Meets all model assumptions
• Does not allow for negative predictions

NIV Monthly Model
• Adj. R^2: 0.07
• Does not account for number of recruiters
• Fails to meet model assumptions
• Has lowest explanatory power of all monthly models

Monthly Linear Multiple Regression Model
• Adj. R^2: 0.23
• Fails to meet model assumptions
• Ability to provide negative predictions


The most important variables from the data available that affect monthly and three-year
production at the station level are:

• The average number of Navy recruiters over the last 12 months at each NRS. This is the
most important variable in all models. This may equate to work capacity.

• The calendar month in which a person accesses. The months of February to
September have more predicted accessions than January. The months of October to
December have fewer predicted accessions than January (only included in short-term
predictions).

• A higher competition index value equates to fewer accessions. This may indicate that areas
with more competition are harder to recruit from.

• A higher four-year average of national leads divided by the population of
an area results in more accessions. This may equate to the level of interest an area
has with regard to joining the Navy.

• A greater distance from an NRS to MEPS results in fewer accessions. This may
equate to a reduction in time spent recruiting.

• A higher percentage of individuals with a minimum education level of a high school
diploma results in more accessions. This may equate to less time spent finding
qualified applicants.
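
The direction of each effect listed above can be read from the sign of the corresponding coefficient. As a minimal sketch, assuming the Poisson model uses the usual log link, a coefficient beta multiplies the expected count by exp(beta) for each one-unit increase in the predictor. The coefficient values below are hypothetical placeholders and are not the fitted values from this research.

```python
import math

# Hypothetical log-scale coefficients (NOT the fitted values from this research).
coefficients = {
    "navy_recruiters": 0.25,      # positive: more recruiters, more expected accessions
    "competition_index": -0.10,   # negative: more competition, fewer expected accessions
}

def rate_ratio(beta, delta=1.0):
    """Multiplicative change in the expected count for a `delta` increase in the predictor."""
    return math.exp(beta * delta)

for name, beta in coefficients.items():
    print(f"{name}: expected accessions multiplied by {rate_ratio(beta):.3f} per unit increase")
```

A ratio above one indicates more predicted accessions and a ratio below one indicates fewer.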

NRD leaders should consider these factors when setting goals and allocating resources 
to stations within their districts. NRD leaders should be trained to attend to these data 
elements, as is intended with the Make Goal game. 

5.2 Future Work 

The models produced in this research could potentially be improved through the
development of a zero-inflated Poisson model to account for the large number of zeroes in the
response variable. Additional factors could also be explored in order to increase
the models’ explanatory capability; these include NRS proximity to military installations,
median income, college attendance rates, and unemployment rates.
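
A quick way to see whether zero inflation is worth pursuing is to compare the observed share of zero counts with the share a Poisson distribution would predict. The sketch below is a rough marginal check only: it ignores the predictors, uses the sample mean as the Poisson rate, and runs on hypothetical counts rather than the station data.

```python
import numpy as np

def zero_excess(counts):
    """Return (observed share of zeros, share expected under a Poisson with the sample mean)."""
    counts = np.asarray(counts, dtype=float)
    observed = float(np.mean(counts == 0))
    expected = float(np.exp(-counts.mean()))       # P(Y = 0) for a Poisson with mean = sample mean
    return observed, expected

# Hypothetical monthly accession counts for ten station-months (not NRC data).
monthly_counts = [0, 0, 0, 0, 0, 0, 4, 5, 6, 5]
observed, expected = zero_excess(monthly_counts)
print(f"observed zeros: {observed:.2f}, Poisson-expected zeros: {expected:.2f}")
```

An observed share well above the expected share suggests that a zero-inflated model may fit better than a plain Poisson model.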

Last, Navy recruiters have a difficult job locating accessions to fill billets such as Navy
SEALs, Explosive Ordnance Disposal technicians, and jobs within the nuclear operations
field. NRD leaders have to spend significant time and manpower in locating and recruiting
individuals qualified for these jobs. Conducting analysis to identify the factors that
contribute to the accessions for these low-density specialty billets would greatly benefit NRC.

APPENDIX A:

Enlisted Market Report Description 


These data and descriptions were provided by NRC, specifically Mr. Robby Powell, NRC N5.


Variable 

Description 

Explanation 

Navy Recruiting District ID Number 

Specific numeric code identifies the Dis¬ 
trict a station falls under 

NA 

Recruiting Station ID 

Specific numeric code that identifies indi¬ 
vidual recruiting stations 

NA 

Recruiting Station Name 

Name of the recruiting station 

NA 

Competition Index 

Specific percentage that is calculated to 
determine how competitive an area is 
with regard to other services recruiting 
goals within the area. 

If the index is higher than 110%, the area
is very competitive. An index lower than 90%
means there is sufficient room for the
recruiter to obtain recruits.

DoD Total Accessions for PY1

Total number of accessions from previous
year 1 (FY 2013) for the Department of
Defense as a whole

Shows previous production within a
given station’s area

Total Quality DoD Upper Accessions for PY1

Total number of accessions from previous
year 1 (FY 2013) with an AFQT score of 50+

Shows previous production within a
given station’s area for desirable recruits.
Desirable because they have a score of
50+ on the AFQT.

Total Army Upper Accessions for PY1

Total number of accessions for the Army
from previous year 1 (FY 2013) with an
AFQT score of 50+

Shows previous production within a
given station’s area for desirable recruits.
Desirable because they have a score of
50+ on the AFQT.

Total Navy Upper Accessions for PY1

Total number of accessions for the Navy
from previous year 1 (FY 2013) with an
AFQT score of 50+

Shows previous production within a
given station’s area for desirable recruits.
Desirable because they have a score of
50+ on the AFQT.

Total Air Force Upper Accessions for PY1

Total number of accessions for the Air
Force from previous year 1 (FY 2013)
with an AFQT score of 50+

Shows previous production within a
given station’s area for desirable recruits.
Desirable because they have a score of
50+ on the AFQT.

Total Marine Upper Accessions for PY1

Total number of accessions for the Marine
Corps from previous year 1 (FY 2013)
with an AFQT score of 50+

Shows previous production within a
given station’s area for desirable recruits.
Desirable because they have a score of
50+ on the AFQT.

Total DoD Accessions for PY2

Total number of accessions from previous
year 2 (FY 2012) with an AFQT score of
50+ for the entire Department of Defense

Shows previous production within a
given station’s area

Total DoD Upper Accessions for PY2

Total number of accessions from previous
year 2 (FY 2012) with an AFQT score of
50+ for the entire Department of Defense

Shows previous production within a
given station’s area for desirable recruits.
Desirable because they have a score of
50+ on the AFQT.

Total Army Upper Accessions for PY2 

Total number of accessions for the Army 
from previous year 2 (FY 2012) with an 
AFQT score of 50+ 

Shows previous production within a 
given stations area for desirable recruits. 
Desirable because they have a score of 
50+ on the AFQT. 

Total Navy Upper Accessions for PY2 

Total number of accessions for the Navy 
from previous year 2 (FY 2012) with an 
AFQT score of 50+ 

Shows previous production within a 
given stations area for desirable recruits. 
Desirable because they have a score of 
50+ on the AFQT. 

Total Air Force Upper Accessions for 

PY2 

Total number of accessions for the Air 

Force from previous year 2 (FY 2012) 
with an AFQT score of 50+ 

Shows previous production within a 
given stations area for desirable recruits. 
Desirable because they have a score of 
50+ on the AFQT. 

Total Marine Upper Accessions for PY2 

Total number of accessions for the Ma¬ 
rine Corps from previous year 2 (FY 
2012) with an AFQT score of 50+ 

Shows previous production within a 
given stations area for desirable recruits. 
Desirable because they have a score of 
50+ on the AFQT. 

Total DoD Accessions for PY3 

Total number of accessions from previous 
year 3 (FY 2011) for the entire Depart¬ 
ment of Defense 

Shows previous production within a 
given stations area 

Total DoD Upper Accessions for PY3 

Total number of accessions from previous 
year 3 (FY 2011) with an AFQT score of 
50+ for the entire Department of Defense 

Shows previous production within a 
given stations area for desirable recruits. 
Desirable because they have a score of 
50+ on the AFQT. 

Total Army Upper Accessions for PY3 

Total number of accessions for the Army 
from previous year 3 (FY 2011) with an 
AFQT score of 50+ 

Shows previous production within a 
given stations area for desirable recruits. 
Desirable because they have a score of 
50+ on the AFQT. 

Total Navy Upper Accessions for PY3 

Total number of accessions for the Navy 
from previous year 3 (FY 2011) with an 
AFQT score of 50+ 

Shows previous production within a 
given stations area for desirable recruits. 
Desirable because they have a score of 
50+ on the AFQT. 

Total Air Force Upper Accessions for 

PY3 

Total number of accessions for the Air 

Force from previous year 3 (FY 2011) 
with an AFQT score of 50+ 

Shows previous production within a 
given stations area for desirable recruits. 
Desirable because they have a score of 
50+ on the AFQT. 

Total Marine Upper Accessions for PY3 

Total number of accessions for the Ma¬ 
rine Corps from previous year 3 (FY 
2011) with an AFQT score of 50+ 

Shows previous production within a 
given stations area for desirable recruits. 
Desirable because they have a score of 
50+ on the AFQT. 

Navy RFMIS Authorized Seats 

Number of Navy billets designed specifi¬ 
cally for this station to have 

This is not the number of actual recruiters 

or the number of allocated recruiter bil¬ 
lets. This number can reflect command 

positions within the recruiting field as 

well 

Army Recruiters 

Average number of Army recruiters over 
the past 12 months 

Gives an average on how many recruiters 
are actually recruiting within this spe¬ 
cific area. Allows for comparison of how 
many recruiters are in each area 

Navy Recruiters 

Average number of Navy recruiters over 
the past 12 months 

Gives an average on how many recruiters 
are actually recruiting within this spe¬ 
cific area. Allows for comparison of how 
many recruiters are in each area 

Air Force Recruiters 

Average number of Air Force recruiters 
over the past 12 months 

Gives an average on how many recruiters 
are actually recruiting within this spe¬ 
cific area. Allows for comparison of how 
many recruiters are in each area 

Marine Recruiters 

Average number of Marine Corps re¬ 
cruiters over the past 12 months 

Gives an average on how many recruiters 
are actually recruiting within this spe¬ 
cific area. Allows for comparison of how 
many recruiters are in each area 

Current High School Senior Count Male 

Total number of high school seniors who 
are male within the specific station’s area. 

Provides some insight into how many po¬ 
tential recruits are available in an area. 

As a baseline this number can be doubled 

if taking into account the number of fe¬ 
males as well. 

Current 17-21 Count Male 

Number of 17-21 year olds within a spe¬ 
cific stations area 

Provides some insight into how many po¬ 
tential recruits are available in an area. 

Current HS Seniors + 17-21 Count 

Total number of high school seniors as 
well as 17-21 year olds within a specific 

stations area 

Gives an overall picture of the number of 
potential recruits within an area. 

Current A Cell Non-prior Service Males 

Total number of individuals who have an 

AFQT score of 50+ as well as a High 
School Diploma Graduate (HSDG) for all 
males who have not previously served in 
the armed forces within a specific stations 

area 

Provides insight into how many desirable 
potential recruits there are in specific sta¬ 
tions area. 

Current A-Cell White Male 

Total number of individuals who have an 

AFQT score of 50+ as well as a High 
School Diploma Graduate (HSDG) for 
all White males within a specific stations 

area 

Provides more focused demographic in¬ 
formation on potential recruits there are 
in specific stations area. Impact to re¬ 
cruiting a more diverse population 

Current A Cell Black Male 

Total number of individuals who have an 

AFQT score of 50+ as well as a High 
School Diploma Graduate (HSDG) for 
all Black males within a specific stations 

area 

Provides more focused demographic in¬ 
formation on potential recruits there are 
in specific stations area. Impact to re¬ 
cruiting a more diverse population 

Current A Cell Hispanic Male 

Total number of individuals who have an 

AFQT score of 50+ as well as a High 
School Diploma Graduate (HSDG) for all 
Hispanic males within a specific stations 

area 

Provides more focused demographic in¬ 
formation on potential recruits there are 
in specific stations area. Impact to re¬ 
cruiting a more diverse population 

Current A Cell Asian-Pacific Islander 

Male 

Total number of individuals who have an 

AFQT score of 50+ as well as a High 
School Diploma Graduate (HSDG) for 

all Asian-Pacific Islander males within a 

specific stations area 

Provides more focused demographic in¬ 
formation on potential recruits there are 
in specific stations area. Impact to re¬ 
cruiting a more diverse population 

Current 22-29 population Male 

Total number of 22-29 year olds within a 
specific stations area 

Provides insight into how many potential 
recruits there are in specific stations area. 

Current Total Male 17-21 + 22-29 

Total number of 17-29 year olds within a 
specific stations area 

Provides insight into how many potential 
recruits there are in specific stations area. 

Websteam Division ID

Specific numeric code that delineates
supervisors/chain of command for the
specific stations

Has no impact on recruiting

Current Prior Service Total 

Total number of prior service individuals 
within a specific stations area 

Impacts recruiting because it has been 
shown that having a higher number of 
prior service individuals within an area 
leads to higher enlistment in that area 

PY1 Selective Service Total 

Total number of individuals who have 

signed up for selective service within a 
specific stations area for previous year 1 
(FY 2013) 

Provides insight into how many potential 
recruits there are in specific stations area. 

This number includes females and others 

who are not required to sign up for selec¬ 
tive service 

PY1 Selective Service White 

Total number of white individuals who 

have signed up for selective service 
within a specific stations area for previ¬ 
ous year 1 (FY 2013) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY1 Selective Service Black

Total number of Black individuals who 

have signed up for selective service 
within a specific stations area for previ¬ 
ous year 1 (FY 2013) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY1 Selective Service Hispanic

Total number of Hispanic individuals 
who have signed up for selective service 
within a specific stations area for previ¬ 
ous year 1 (FY 2013) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY1 Selective Service Asian Pacific Islander

Total number of Asian Pacific Islander in¬ 
dividuals who have signed up for selec¬ 
tive service within a specific stations area 
for previous year 1 (FY 2013) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY1 Selective Service Other

Total number of ’other’ race individuals 

who have signed up for selective service 
within a specific stations area for previ¬ 
ous year 1 (FY 2013). ’Other’ means any 

races not listed above. 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY2 Selective Service Total 

Total number of individuals who have 

signed up for selective service within a 
specific stations area for previous year 2 
(FY 2012) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required to 
sign up. 

PY2 Selective Service White 

Total number of White individuals who 

have signed up for selective service 
within a specific stations area for previ¬ 
ous year 2 (FY 2012) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY2 Selective Service Black 

Total number of Black individuals who 

have signed up for selective service 
within a specific stations area for previ¬ 
ous year 2 (FY 2012) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY2 Selective Service Hispanic 

Total number of Hispanic individuals 
who have signed up for selective service 
within a specific stations area for previ¬ 
ous year 2 (FY 2012) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY2 Selective Service API 

Total number of Asian Pacific Islander in¬ 
dividuals who have signed up for selec¬ 
tive service within a specific stations area 
for previous year 2 (FY 2012) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY2 Selective Service Other 

Total number of ’other’ race individuals 

who have signed up for selective service 
within a specific stations area for previ¬ 
ous year 2 (FY 2012). ’Other’ means any 

races not listed above. 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY3 Selective Service Total 

Total number of individuals who have 

signed up for selective service within a 
specific stations area for previous year 3 
(FY 2011) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required to 
sign up. 

PY3 Selective Service White 

Total number of White individuals who 

have signed up for selective service 
within a specific stations area for previ¬ 
ous year 3 (FY 2011) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY3 Selective Service Black 

Total number of Black individuals who 

have signed up for selective service 
within a specific stations area for previ¬ 
ous year 3 (FY 2011) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY3 Selective Service Hispanic 

Total number of Hispanic individuals 
who have signed up for selective service 
within a specific stations area for previ¬ 
ous year 3 (FY 2011) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY3 Selective Service API 

Total number of Asian Pacific Islander in¬ 
dividuals who have signed up for selec¬ 
tive service within a specific stations area 
for previous year 3 (FY 2011) 

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

PY3 Selective Service Other

Total number of ‘other’ race individuals
who have signed up for selective service
within a specific station’s area for previous
year 3 (FY 2011). ‘Other’ means any
races not listed above.

Provides further demographic insight into 
how many potential recruits are within a 
specific stations area. This number does 
include females and others not required 
to sign up. Impact to recruiting a more 

diverse force. 

Market Segmentation Youth Based Count 
(17-24) 

Estimated number of 17-24 year olds 
within a specific stations area. 

Number comes from the 2010 US census. 

Provides further insight into how many
potential recruits are within a specific
station’s area.

Count of 17-24 yo in Navy’s best Market 
Segments 

Estimated number of 17-24 year olds 
within a specific stations area who fall 
into the Navy’s top 22 market seg¬ 
ments from the Navy Market Segmenta¬ 
tion model 

Number is calculated from the percentage
of the area’s population that became Test
Score Category I-IIIA for FY 2013. Provides
a method to see how many people
from a specific area will come from
segments that produce at higher rates.

PY1 Local Leads 

Total number of local leads from previ¬ 
ous year 1 (FY 2013) within a specific 

stations area 

Local leads are individuals that have in¬ 
quired about the Navy from local adver¬ 
tising campaigns. 

PY1 National Leads

Total number of national leads from pre¬ 
vious year 1 (FY 2013) within a specific 

stations area 

National leads are individuals that have 

inquired about the Navy from the national 
advertising campaign. Zip codes are uti¬ 
lized to identify which station to send the 

national leads to. 

PY2 Local Leads 

Total number of local leads from previ¬ 
ous year 2 (FY 2012) within a specific 

stations area 

Local leads are individuals that have in¬ 
quired about the Navy from local adver¬ 
tising campaigns. 



PY2 National Leads 

Total number of national leads from pre¬ 
vious year 2 (FY 2012) within a specific 

stations area 

National leads are individuals that have 

inquired about the Navy from the national 
advertising campaign. Zip codes are uti¬ 
lized to identify which station to send the 

national leads to. 

PY3 Local Leads 

Total number of local leads from previ¬ 
ous year 3 (FY 2011) within a specific 

stations area 

Local leads are individuals that have in¬ 
quired about the Navy from local adver¬ 
tising campaigns. 

PY3 National Leads 

Total number of national leads from pre¬ 
vious year 3 (FY 2011) within a specific 

stations area 

National leads are individuals that have 

inquired about the Navy from the national 
advertising campaign. Zip codes are uti¬ 
lized to identify which station to send the 

national leads to. 

Current Enlisted Recruiters assigned 

from NRC PSR 

Number of enlisted recruiters assigned to 
the specific stations as of February 2014. 

Provides insight into how many recruit¬ 
ing assets are in the specific stations area 

PY1 High School Master File Total

Total number of high school seniors from
previous year 1 (FY 2013) within a
specific station’s area

Provides insight into how large the future
recruitable population is within a specific
station’s area.

PY1 HSMF Total White

Total number of White high school seniors
from previous year 1 (FY 2013)
within a specific station’s area

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

PY1 HSMF Total Black

Total number of Black high school seniors
from previous year 1 (FY 2013)
within a specific station’s area

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

PY1 HSMF Total Hispanic

Total number of Hispanic high school seniors
from previous year 1 (FY 2013)
within a specific station’s area

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

PY1 HSMF Total API

Total number of Asian Pacific Islander
high school seniors from previous year 1
(FY 2013) within a specific station’s area

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

PY1 HSMF Total Other

Total number of ‘other’ races high school
seniors from previous year 1 (FY 2013)
within a specific station’s area. ‘Other’
means other races not listed above.

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

PY2 HSMF Total

Total number of high school seniors from
previous year 2 (FY 2012) within a
specific station’s area

Provides insight into how large the future
recruitable population is within a specific
station’s area.

PY2 HSMF Total White

Total number of White high school seniors
from previous year 2 (FY 2012)
within a specific station’s area

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

PY2 HSMF Total Black

Total number of Black high school seniors
from previous year 2 (FY 2012)
within a specific station’s area

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

PY2 HSMF Total Hispanic

Total number of Hispanic high school seniors
from previous year 2 (FY 2012)
within a specific station’s area

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

PY2 HSMF Total API

Total number of Asian Pacific Islander
high school seniors from previous year 2
(FY 2012) within a specific station’s area

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

PY2 HSMF Total Other

Total number of ‘other’ races high school
seniors from previous year 2 (FY 2012)
within a specific station’s area. ‘Other’
means other races not listed above.

Provides demographic insight into how
large the future recruitable population is
within a specific station’s area. Impact
to recruiting a more diverse force.

NRD Name 

Specific District name that the station 

falls under 

NA 

APPENDIX B:

Noble Index Value Model Validation 


The purpose of this appendix is to provide the reader with model verification of the
Three-Year Noble Index Value Model. All techniques listed in Chapter 3 are applied in this
appendix. The first model assumption to check is that the residuals show constant variance.
Figure B.1 is the residuals v. fitted values plot. To show constant variance,
the observations should be scattered symmetrically about zero in the vertical direction. This
plot shows the residuals have a cone shape, which is indicative of heteroscedasticity
(non-constant variance).
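
The cone shape described above can also be checked numerically. The sketch below is one informal way to do so, under the assumption that fitted values and residuals are available as arrays: it sorts the residuals by fitted value, splits them into equal-count bins, and compares the spread in each bin. The data are simulated, and the name spread_by_fitted is hypothetical.

```python
import numpy as np

def spread_by_fitted(fitted, residuals, bins=4):
    """Standard deviation of the residuals within equal-count bins of the fitted values."""
    fitted = np.asarray(fitted, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    order = np.argsort(fitted)                     # sort residuals by fitted value
    groups = np.array_split(residuals[order], bins)
    return [float(g.std(ddof=1)) for g in groups]

# Simulated residuals (not from the NIV model).
rng = np.random.default_rng(1)
fitted = rng.uniform(1, 10, size=400)
cone = rng.normal(scale=fitted)                    # spread grows with the fitted value
flat = rng.normal(scale=3.0, size=400)             # constant spread

print("cone-shaped:", [round(s, 2) for s in spread_by_fitted(fitted, cone)])
print("constant:   ", [round(s, 2) for s in spread_by_fitted(fitted, flat)])
```

A spread that rises steadily from the first bin to the last corresponds to the cone shape in a residuals v. fitted plot, while roughly equal spreads correspond to constant variance.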


Figure B.1: Three-Year NIV Model Residual v Fitted Values


Figure B.2 is the normal Q-Q plot, which is utilized to check for normality. If the residuals are
normally distributed, the points should follow the line in the plot. In this plot, the residuals
stray away from the line towards the latter part of the plot.
This is due to the number of observations within the data set and the variance
within the residuals. The residuals of this model are not normally distributed.
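
The construction behind a normal Q-Q plot can be sketched as follows: the sorted, standardized residuals are paired with the standard normal quantiles at matching probabilities, and large gaps between the two indicate departure from normality. The plotting positions (i - 0.5)/n and the helper names are assumptions for illustration, and the samples are simulated.

```python
import numpy as np
from statistics import NormalDist

def qq_points(residuals):
    """Return (theoretical normal quantiles, sorted standardized residuals)."""
    r = np.sort(np.asarray(residuals, dtype=float))
    n = len(r)
    standardized = (r - r.mean()) / r.std(ddof=1)
    probs = (np.arange(1, n + 1) - 0.5) / n        # plotting positions, an assumed convention
    theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
    return theoretical, standardized

def max_qq_gap(residuals):
    """Largest vertical distance between the Q-Q points and the 45-degree line."""
    theoretical, standardized = qq_points(residuals)
    return float(np.max(np.abs(standardized - theoretical)))

# Simulated samples (not model residuals from this research).
rng = np.random.default_rng(2)
normal_sample = rng.normal(size=500)
skewed_sample = rng.exponential(size=500)
print("normal sample gap:", round(max_qq_gap(normal_sample), 2))
print("skewed sample gap:", round(max_qq_gap(skewed_sample), 2))
```

Points that stray from the line at one or both ends of the plot show up here as a large gap.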

Figure B.2: Three-Year NIV Model Normal Q-Q Plot

To ensure the residuals are uncorrelated, a residual plot is generated in Figure B.3. For 
residuals to be uncorrelated, the plot should show points scattered evenly along the hori¬ 
zontal line. It is evident that the residuals are independent and uncorrelated. If this were 
not the case then there would be long runs of observations above or below the line [11]. 
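
The "long runs" criterion can be made concrete by counting the longest stretch of consecutive residuals on the same side of the zero line. This is a minimal sketch with made-up residual sequences; the function name longest_sign_run is hypothetical, and no particular cutoff for what counts as a long run is implied.

```python
def longest_sign_run(residuals):
    """Length of the longest run of consecutive residuals with the same nonzero sign."""
    longest = current = 0
    previous = 0
    for r in residuals:
        sign = (r > 0) - (r < 0)                   # +1, -1, or 0
        if sign != 0 and sign == previous:
            current += 1
        else:
            current = 1 if sign != 0 else 0
        previous = sign
        longest = max(longest, current)
    return longest

# Made-up residual sequences for illustration.
alternating = [0.4, -0.2, 0.1, -0.5, 0.3, -0.1]
drifting = [0.4, 0.2, 0.1, 0.5, -0.3, -0.1]
print(longest_sign_run(alternating), longest_sign_run(drifting))
```

Residuals that alternate around the line give short runs, whereas correlated residuals tend to stay above or below the line for many observations in a row.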

In addition to this plot, a correlation matrix is provided. Figure B.4 is the correlation matrix 
and shows the Noble Index Value has a correlation value of 0.47. This does not indicate that 
the residuals are correlated. A score close to one or negative one indicates high correlation. 


In order to check for influential observations a Cook’s Distance Plot is utilized. The Cook’s 
Distance plot allows the user to identify observations that affect the rest of the model. A 
Cook’s distance value above 0.50 indicates an influential observation and should be further 
investigated on whether to remove the observation. Figure B.5 below shows the Cook’s 
Distance Plot. There are no signs of influential observations. 
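
For reference, Cook's distance for a linear model can be computed directly from the residuals and leverages. The sketch below uses the standard textbook formula and flags observations above the 0.50 threshold stated above. The data are synthetic, with one deliberately extreme point added so that the flag has something to find; the function name cooks_distance is an assumption.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each observation of an ordinary least squares fit of y on X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])      # add an intercept column
    p = design.shape[1]                            # parameters, including the intercept
    hat = design @ np.linalg.pinv(design.T @ design) @ design.T
    leverage = np.diag(hat)
    residuals = y - hat @ y
    s2 = float(residuals @ residuals) / (n - p)    # residual variance estimate
    return residuals**2 / (p * s2) * leverage / (1.0 - leverage) ** 2

# Synthetic data: ten points near a line plus one extreme point (not NRC data).
x = np.append(np.arange(10.0), 30.0)
y = 2.0 * x + 1.0 + np.tile([0.5, -0.5], 6)[:11]
y[-1] = 0.0                                        # far from the line at high leverage
distances = cooks_distance(x.reshape(-1, 1), y)
flagged = np.where(distances > 0.5)[0]
print("flagged observations:", flagged.tolist())
```

An observation is flagged only when it combines a sizeable residual with high leverage, which is why a plot with no values above the threshold is read as showing no influential observations.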

The last validation is the structural check of the model. In order to verify the structure of 

Figure B.3: Three-Year NIV Model Residual Plot


Figure B.4: Three-Year NIV Model Correlation Matrix


the model, a partial residual plot is created. This allows for analysis of the relationship
between the response variable and each predictor. Analysis of the plots determines whether
the predictor variable needs to be transformed in order to better fit the model. Figure B.6 is
the partial residual plot; transformation of the predictor variable is not needed.

Figure B.5: Three-Year NIV Model Cook’s Distance Plot

Figure B.6: Three-Year NIV Model Partial Residual Plot

APPENDIX C:

Three-Year Multiple Regression Model Validation 


The purpose of this appendix is to provide model validation of the Three-Year multiple
regression model. The first assumption to investigate is constant variance. Figure C.1 is
the residual v. fitted plot. Again, residuals have constant variance if the points are scattered
symmetrically above and below zero. This plot has a cone shape, which is indicative of
heteroscedasticity (non-constant variance).
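
A cone shape can also be checked numerically. The sketch below is an assumption-laden
illustration rather than part of the original analysis: it regresses the squared residuals on
the fitted values and uses n times the resulting R-squared as a test statistic (the studentized
Breusch-Pagan form), compared against the 5% chi-square critical value with one degree of
freedom. The function name and the data are hypothetical.

```python
from statistics import mean

def breusch_pagan_lm(fitted, resid):
    """n * R^2 from regressing squared residuals on the fitted values."""
    n = len(fitted)
    sq = [e ** 2 for e in resid]
    f_bar, s_bar = mean(fitted), mean(sq)
    sff = sum((f - f_bar) ** 2 for f in fitted)
    sfs = sum((f - f_bar) * (s - s_bar) for f, s in zip(fitted, sq))
    sss = sum((s - s_bar) ** 2 for s in sq)
    r_squared = sfs ** 2 / (sff * sss)
    return n * r_squared

# Hypothetical fitted values and residuals whose spread grows with the fitted value.
fitted = [float(f) for f in range(1, 11)]
resid = [0.2 * f * (1 if i % 2 == 0 else -1) for i, f in enumerate(fitted)]

lm = breusch_pagan_lm(fitted, resid)
CHI2_1DF_5PCT = 3.841  # 5% critical value of chi-square with 1 degree of freedom
heteroscedastic = lm > CHI2_1DF_5PCT
```

A statistic above the critical value agrees with the visual reading of a cone-shaped plot.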


Figure C.1: Three-Year Multiple Regression Model Residual v Fitted Plot


The next assumption is that the model residuals are normally distributed. To check this, a
Normal Q-Q plot is generated. Figure C.2 is the Normal Q-Q plot. The residuals stray off the
line at the beginning and at the end of the plot. This is indicative of non-normality, and the
model fails this assumption.
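
The points of a Normal Q-Q plot can be computed directly, which gives a rough numeric
companion to the visual check. The sketch below is illustrative only: the function names, the
plotting positions (i - 0.5)/n, and the data are assumptions and are not taken from this
research. The correlation between the ordered residuals and the theoretical quantiles is close
to one when the points follow the line.

```python
from math import exp, sqrt
from statistics import NormalDist, mean, stdev

def qq_points(resid):
    """Theoretical normal quantiles paired with the ordered standardized residuals."""
    n = len(resid)
    mu, sigma = mean(resid), stdev(resid)
    ordered = sorted((e - mu) / sigma for e in resid)
    # Plotting positions (i - 0.5) / n avoid the undefined 0 and 1 quantiles.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, ordered

def qq_correlation(resid):
    """Pearson correlation of the Q-Q points; values near 1 follow the line."""
    t, o = qq_points(resid)
    t_bar, o_bar = mean(t), mean(o)
    num = sum((a - t_bar) * (b - o_bar) for a, b in zip(t, o))
    den = sqrt(sum((a - t_bar) ** 2 for a in t) * sum((b - o_bar) ** 2 for b in o))
    return num / den

# Hypothetical residuals: one set shaped like a normal sample, one heavily right-skewed.
normal_like = [NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
skewed = [exp(2.0 * z) for z in normal_like]

r_normal = qq_correlation(normal_like)
r_skewed = qq_correlation(skewed)
```

The skewed set produces a noticeably lower correlation, which corresponds to points straying
off the line at the ends of the plot.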

The next assumption to check is that the residuals are uncorrelated. Figure C.3 is the residual
plot for this model. If the residuals are uncorrelated, then the observations should be
scattered along the horizontal line within the plot. From this plot, the residuals appear to be
uncorrelated.
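
When the observations have a natural order, the visual check can be supplemented with the
Durbin-Watson statistic. This statistic is not used in this research; it is offered here as one
possible sketch, and it assumes the residuals are sorted in a meaningful order (for example,
by time). Values near 2 suggest no first-order correlation, values near 0 suggest positive
correlation, and values near 4 suggest negative correlation. The data are hypothetical.

```python
def durbin_watson(resid):
    """Durbin-Watson statistic for a sequence of ordered residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

# Hypothetical residual sequences.
trending = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]   # neighbors move together
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]     # neighbors flip sign

dw_trending = durbin_watson(trending)        # well below 2
dw_alternating = durbin_watson(alternating)  # well above 2
```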
Figure C.2: Three-Year Multiple Regression Model Normal Q-Q Plot


Figure C.3: Three-Year Multiple Regression Residual Plot


To further check for correlation, a correlation matrix is provided in Figure C.4. Within the
matrix, the darker the shade of blue, the higher the correlation between two variables. There
is no need to remove any of the variables based on the correlation matrix.

Figure C.4: Three-Year Multiple Regression Model Correlation Matrix


The Cook’s Distance plot is generated to see whether any observations are highly influential.
An observation is considered highly influential if it has a Cook’s Distance value above
0.50. Figure C.5 confirms there are no influential observations within the model.


Figure C.5: Three-Year Multiple Regression Model Cook's Distance Plot


Last is the structural validation of the model. Partial residual plots of each predictor
variable are analyzed to determine whether any transformation of the variables is required
to improve the model. These plots are in Figure C.6 to Figure C.10. Based on these plots,
there is no need to transform any of the variables.
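
For reference, the quantity plotted in a partial residual plot is the model residual plus the
fitted contribution of one predictor. The sketch below shows one way to compute it for a small
multiple regression. It is not the code used in this research; the helper functions, the
variable names, and the data are hypothetical, and the fit uses the normal equations for
brevity.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    return solve(xtx, xty)

def partial_residuals(columns, y, j):
    """Residual plus the fitted term of predictor j, one value per observation."""
    beta = ols(columns, y)
    n = len(y)
    fitted = [beta[0] + sum(b * col[i] for b, col in zip(beta[1:], columns))
              for i in range(n)]
    return [y[i] - fitted[i] + beta[j + 1] * columns[j][i] for i in range(n)]

# Hypothetical station data with two predictors.
recruiters = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
leads_per_capita = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
accessions = [4.2, 5.3, 9.1, 10.4, 14.2, 15.3]

# Values to plot against the recruiters column.
pr_recruiters = partial_residuals([recruiters, leads_per_capita], accessions, 0)
```

Plotting these values against the predictor and seeing a roughly straight-line pattern
corresponds to the conclusion that no transformation is needed.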
Figure C.6: Average Number of Recruiters Partial Residual Plot for Three-Year Multiple
Regression Model

Figure C.7: Percentage of High School Diploma Graduates Partial Residual Plot for Three-Year
Multiple Regression Model

Figure C.8: LeadsPerCapita Partial Residual Plot for Three-Year Multiple Regression Model

Figure C.9: DistanceToMEPS Partial Residual Plot for Three-Year Multiple Regression Model

Figure C.10: Competition Index ("EMR") Partial Residual Plot for Three-Year Multiple
Regression Model
APPENDIX D: 

Noble Index Value Monthly Model Validation 


The first assumption to check is constant variance. Figure D.1 shows the residual values
versus the fitted values plot. For constant variance to be present, the residuals should be
scattered symmetrically above and below zero. This plot has a significant cone shape and shows
signs of heteroscedasticity. This model fails to meet the assumption of constant variance.


Figure D.1: NIV Monthly Model Residual v Fitted Plot


The next assumption to check is that the residuals are normally distributed. Figure D.2 is a
display of the normal Q-Q plot, which allows for an inspection of normality. The residuals
should follow the black line. This is not the case, and this model fails the assumption
that the residuals are normally distributed.

The next check is to ensure the residuals are uncorrelated. To do this, an inspection of the
residual plot is performed. Figure D.3 is the residual plot. This plot is somewhat useful
but very cluttered, since there are around 35,000 observations. From this plot, it
seems that the residuals are uncorrelated.

To further check for correlation, a correlation matrix is generated.
Figure D.4 shows the correlation matrix. There is no sign of correlation within this model.

Figure D.2: NIV Monthly Model Normal Q-Q Plot


Figure D.3: NIV Monthly Model Residual Plot


To determine whether any influential outliers are affecting the model, a Cook’s Distance plot
is generated. Any observation with a Cook’s Distance value greater than 0.50 is considered
influential. Figure D.5 shows that none of the observations are influential.

The last assumption to check is that the model is structurally sound. Based on the
partial residual plot in Figure D.6, this model is structurally sound and there is no need to
transform the predictor variable.

Figure D.4: NIV Monthly Model Correlation Matrix

Figure D.5: NIV Monthly Model Cook’s Distance Plot

Figure D.6: Monthly NIV Partial Residual Plot
APPENDIX E: 

Monthly Linear Multiple Regression Model Validation 


The purpose of this appendix is to present the model validation of the Monthly Linear
Multiple Regression Model. The first assumption to validate is that the residuals have
constant variance. The residual values versus fitted values plot is shown in Figure E.1.
It shows that the residuals have signs of heteroscedasticity and violate the first assumption.


Figure E.1: Monthly Linear Multiple Regression Model Residual v Fitted Plot

The assumption that the residuals are normally distributed is the next assumption to check.
The normal Q-Q plot is generated in Figure E.2. For the residuals to be normally distributed,
they should follow the line within the plot. Figure E.2 shows that the residuals are not
normally distributed, and the model violates this assumption.

The residual plot is constructed to check for correlation between the residuals.
Figure E.3 shows the residual plot. There seems to be no correlation from looking at this
plot, but it is hard to tell since there are close to 35,000 observations.

Figure E.2: Monthly Linear Multiple Regression Model Normal Q-Q Plot


Figure E.3: Monthly Linear Multiple Regression Model Residual Plot

To further check for correlation, a correlation matrix is generated in Figure E.4. This
allows for numeric inspection of correlation. A correlation value close to 1 or -1 indicates a
highly correlated pair of variables. A correlated variable tries to explain the relationship
between the response and the predictors using data similar to that of another variable.
QualityStationIndex, QualityStationIndexFemale, PerBot22, PerTop22, and PerTop22Female all
have signs of correlation and were removed from the model.
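
The screening of correlated variables described above can be sketched as a pairwise scan of
the correlation matrix. The example below is illustrative only: the 0.8 threshold is an
assumption (the text does not state the cutoff that was used), the data values are made up,
and only the variable names are borrowed from this appendix.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    num = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    den = sqrt(sum((x - a_bar) ** 2 for x in a) * sum((y - b_bar) ** 2 for y in b))
    return num / den

def correlated_pairs(data, threshold=0.8):
    """Return (name1, name2, r) for every pair whose |r| exceeds the threshold."""
    flagged = []
    for n1, n2 in combinations(data, 2):
        r = pearson(data[n1], data[n2])
        if abs(r) > threshold:
            flagged.append((n1, n2, r))
    return flagged

# Hypothetical values for three of the variables named in this appendix.
data = {
    "PerTop22": [10.0, 20.0, 30.0, 40.0, 50.0],
    "PerBot22": [90.0, 80.0, 70.0, 60.0, 50.0],
    "DistanceToMEPS": [5.0, 3.0, 8.0, 1.0, 6.0],
}
flagged = correlated_pairs(data)
```

In this made-up data PerTop22 and PerBot22 are perfectly negatively correlated, so the pair is
flagged and one of the two would be a candidate for removal.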

To check the assumption that there are no influential outliers, a Cook’s Distance plot is
created in Figure E.5. This plot shows that none of the observations are influential, since
they do not exceed a value of 0.50.

The last assumption to check is that the model is structurally sound. To do this, partial
residual plots for each predictor variable are inspected. These plots are in Figure E.6 to
Figure E.13. Based on the partial residual plots, this model is structurally sound and there
is no need to transform any of the predictor variables.

Figure E.4: Monthly Linear Multiple Regression Model Correlation Matrix


Figure E.5: Monthly Linear Multiple Regression Model Cook’s Distance Plot
Figure E.6: Average Number of Recruiters Partial Residual Plot for Monthly Multiple Regression
Model

Figure E.7: PopulationDensity Partial Residual Plot for Monthly Multiple Regression Model

Figure E.8: Percentage of High School Diploma Graduates Partial Residual Plot for Monthly
Multiple Regression Model

Figure E.9: LeadsPerCapita Partial Residual Plot for Monthly Multiple Regression Model

Figure E.10: DistanceToMEPS Partial Residual Plot for Monthly Multiple Regression Model

Figure E.11: X75thPercentile Partial Residual Plot for Monthly Multiple Regression Model

Figure E.12: Competition Index ("EMR") Partial Residual Plot for Monthly Multiple Regression
Model

Figure E.13: QMAQualityRatio Partial Residual Plot for Monthly Multiple Regression Model
APPENDIX F: 

Additional Poisson Monthly Model Validation 


Figure F.1 is the correlation matrix for the Poisson model. The correlation matrix allows for
numeric inspection of correlation. A correlation value close to 1 or -1 indicates a highly
correlated pair of variables. A correlated variable tries to explain the relationship between
the response and the predictors using data similar to that of another variable.
QualityStationIndex, QualityStationIndexFemale, PerBot22, PerTop22, PerTop22Female,
QMAvPopulation, and QMAQualityRatio all have signs of correlation and were removed from the
model.

Figure F.1: Monthly Poisson Model Correlation Matrix


The next set of plots are the partial residual plots generated for each predictor variable
within the model. Examining the partial residual plots shows whether any of the predictor
variables need to be transformed. None of the variables require transformation.
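
For a Poisson model, partial residuals are commonly formed on the scale of the linear
predictor, as the working residual plus the fitted term of one predictor. The sketch below
illustrates this for a single predictor with a log link. The source does not state how its
plots were produced, so this is an assumption; the function names and the data are
hypothetical.

```python
from math import exp, log

def fit_poisson(x, y, iterations=25):
    """Fit log(mu) = b0 + b1 * x by Newton's method (Fisher scoring for the log link)."""
    b0, b1 = log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [exp(b0 + b1 * xi) for xi in x]
        # Score vector X'(y - mu) and information matrix X' diag(mu) X.
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system explicitly.
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1

def poisson_partial_residuals(x, y):
    """Working residual (y - mu) / mu plus the fitted term b1 * x."""
    b0, b1 = fit_poisson(x, y)
    mu = [exp(b0 + b1 * xi) for xi in x]
    return [(yi - mi) / mi + b1 * xi for xi, yi, mi in zip(x, y, mu)]

# Hypothetical monthly accession counts against one predictor.
recruiters = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
accessions = [1.0, 2.0, 2.0, 4.0, 5.0, 8.0]
partial = poisson_partial_residuals(recruiters, accessions)
```

As in the linear case, a roughly straight-line pattern when these values are plotted against
the predictor is consistent with the conclusion that no transformation is required.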
Figure F.2: Average Number of Recruiters Partial Residual Plot for Poisson Monthly Model

Figure F.3: PopulationDensity Partial Residual Plot for Poisson Monthly Model

Figure F.4: Percentage of High School Diploma Graduates Partial Residual Plot for Poisson
Monthly Model

Figure F.5: LeadsPerCapita Partial Residual Plot for Poisson Monthly Model

Figure F.6: DistanceToMEPS Partial Residual Plot for Poisson Monthly Model

Figure F.7: X75thPercentile Partial Residual Plot for Poisson Monthly Model